Dataset statistics
| Number of variables | 41 |
|---|---|
| Number of observations | 1901539 |
| Missing cells | 18885322 |
| Missing cells (%) | 24.2% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 2.3 GiB |
| Average record size in memory | 1.3 KiB |
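The percentages in this table follow directly from the raw counts. A minimal sketch in plain Python that recomputes them from the figures above; the variable names are illustrative, and the memory figure uses the rounded 2.3 GiB, so that result is approximate:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 41
n_observations = 1_901_539
missing_cells = 18_885_322
total_size_gib = 2.3  # rounded value as reported

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size in KiB (1 GiB = 1024**2 KiB).
avg_record_kib = total_size_gib * 1024**2 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_kib:.1f} KiB")
```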
Variable types
| DateTime | 1 |
|---|---|
| Unsupported | 2 |
| Text | 12 |
| Numeric | 13 |
| Categorical | 13 |
Alerts
| Alert | Type |
|---|---|
latitude is highly overall correlated with BoroName | High correlation |
longitude is highly overall correlated with BoroName | High correlation |
number_of_persons_injured is highly overall correlated with number_of_motorist_injured and 1 other fields | High correlation |
number_of_persons_killed is highly overall correlated with number_of_pedestrians_killed and 4 other fields | High correlation |
number_of_pedestrians_injured is highly overall correlated with contributing_factor_vehicle_4 | High correlation |
number_of_pedestrians_killed is highly overall correlated with number_of_persons_killed and 4 other fields | High correlation |
number_of_motorist_injured is highly overall correlated with number_of_persons_injured and 1 other fields | High correlation |
number_of_motorist_killed is highly overall correlated with number_of_persons_killed and 1 other fields | High correlation |
collision_id is highly overall correlated with crash_year | High correlation |
crash_year is highly overall correlated with collision_id | High correlation |
total_injured is highly overall correlated with number_of_persons_injured and 1 other fields | High correlation |
total_killed is highly overall correlated with number_of_persons_killed and 4 other fields | High correlation |
number_of_cyclist_killed is highly overall correlated with number_of_persons_killed and 5 other fields | High correlation |
contributing_factor_vehicle_3 is highly overall correlated with number_of_cyclist_killed and 2 other fields | High correlation |
contributing_factor_vehicle_4 is highly overall correlated with number_of_pedestrians_injured and 3 other fields | High correlation |
contributing_factor_vehicle_5 is highly overall correlated with number_of_pedestrians_killed and 3 other fields | High correlation |
crash_month is highly overall correlated with holiday_name | High correlation |
holiday_name is highly overall correlated with crash_month and 1 other fields | High correlation |
is_public_holiday is highly overall correlated with holiday_name | High correlation |
BoroName is highly overall correlated with latitude and 1 other fields | High correlation |
severity is highly overall correlated with number_of_persons_killed and 2 other fields | High correlation |
number_of_cyclist_injured is highly imbalanced (91.7%) | Imbalance |
number_of_cyclist_killed is highly imbalanced (99.9%) | Imbalance |
is_public_holiday is highly imbalanced (83.7%) | Imbalance |
Number_of_involved_Vehicles is highly imbalanced (51.6%) | Imbalance |
zip_code has 455978 (24.0%) missing values | Missing |
on_street_name has 404419 (21.3%) missing values | Missing |
cross_street_name has 714151 (37.6%) missing values | Missing |
off_street_name has 1558366 (82.0%) missing values | Missing |
contributing_factor_vehicle_1 has 642751 (33.8%) missing values | Missing |
contributing_factor_vehicle_2 has 1650130 (86.8%) missing values | Missing |
contributing_factor_vehicle_3 has 1892805 (99.5%) missing values | Missing |
contributing_factor_vehicle_4 has 1899878 (99.9%) missing values | Missing |
contributing_factor_vehicle_5 has 1901079 (> 99.9%) missing values | Missing |
vehicle_type_code_2 has 375446 (19.7%) missing values | Missing |
vehicle_type_code_3 has 1770054 (93.1%) missing values | Missing |
vehicle_type_code_4 has 1871192 (98.4%) missing values | Missing |
vehicle_type_code_5 has 1893051 (99.6%) missing values | Missing |
holiday_name has 1856022 (97.6%) missing values | Missing |
number_of_persons_killed is highly skewed (γ1 = 33.86868759) | Skewed |
number_of_pedestrians_killed is highly skewed (γ1 = 42.47340165) | Skewed |
number_of_motorist_killed is highly skewed (γ1 = 54.45987653) | Skewed |
total_killed is highly skewed (γ1 = 34.1523306) | Skewed |
collision_id has unique values | Unique |
crash_time is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
geometry is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
number_of_persons_injured has 1451297 (76.3%) zeros | Zeros |
number_of_persons_killed has 1898797 (99.9%) zeros | Zeros |
number_of_pedestrians_injured has 1797087 (94.5%) zeros | Zeros |
number_of_pedestrians_killed has 1900129 (99.9%) zeros | Zeros |
number_of_motorist_injured has 1616538 (85.0%) zeros | Zeros |
number_of_motorist_killed has 1900485 (99.9%) zeros | Zeros |
crash_hour has 62782 (3.3%) zeros | Zeros |
total_injured has 1451188 (76.3%) zeros | Zeros |
total_killed has 1898794 (99.9%) zeros | Zeros |
Reproduction
| Analysis started | 2025-04-25 23:00:00.468255 |
|---|---|
| Analysis finished | 2025-04-25 23:06:56.779568 |
| Duration | 6 minutes and 56.31 seconds |
| Software version | ydata-profiling v4.5.1 |
| Download configuration | config.json |
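A report of this kind can be regenerated along the following lines. This is a sketch under assumptions: the source data is taken to be available as a CSV file, and `csv_path`, `output_path`, and the report title are hypothetical placeholders, since the original input file and settings are not stated here.

```python
def build_profile(csv_path, output_path="report.html"):
    """Profile a CSV file and write an HTML report.

    Both arguments are hypothetical placeholders; the data file behind
    this report is not known.
    """
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # crash_date is parsed up front so it is profiled as a date column.
    df = pd.read_csv(csv_path, parse_dates=["crash_date"])
    profile = ProfileReport(df, title="Collision data profile")
    profile.to_file(output_path)
    return profile
```

With roughly 1.9 million rows the full analysis took about seven minutes in the run recorded above, so profiling a sample first may be a reasonable way to iterate.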
crash_date
Date
| Distinct | 4672 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 29.0 MiB |
| Minimum | 2012-07-01 00:00:00 |
| Maximum | 2025-04-15 00:00:00 |
crash_time
Unsupported
REJECTED  UNSUPPORTED 
| Missing | 0 |
|---|---|
| Missing (%) | 0.0% |
| Memory size | 101.6 MiB |
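The profiler could not infer a type for crash_time, which often happens when a column holds time-of-day objects or mixed values. One possible cleaning step is sketched below. It assumes the values are either `datetime.time` objects or "HH:MM" strings; neither format is confirmed by this report, and `minutes_since_midnight` is a hypothetical helper name.

```python
from datetime import datetime, time


def minutes_since_midnight(value):
    """Convert a time-of-day value to minutes since midnight.

    Accepts datetime.time objects or "H:MM" / "HH:MM" strings and returns
    None for anything else. Both input formats are assumptions.
    """
    if isinstance(value, time):
        return value.hour * 60 + value.minute
    if isinstance(value, str):
        try:
            parsed = datetime.strptime(value.strip(), "%H:%M")
        except ValueError:
            return None
        return parsed.hour * 60 + parsed.minute
    return None


print(minutes_since_midnight(time(13, 30)))
print(minutes_since_midnight("0:05"))
```

A numeric column produced this way would be profiled as a regular numeric variable rather than rejected.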
zip_code
Text
MISSING 
| Distinct | 233 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 455978 |
| Missing (%) | 24.0% |
| Memory size | 113.9 MiB |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 5 |
| Min length | 5 |
Characters and Unicode
| Total characters | 7227805 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 4 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 11230 |
|---|---|
| 2nd row | 11208 |
| 3rd row | 10475 |
| 4th row | 11207 |
| 5th row | 10017 |
| Value | Count | Frequency (%) |
| 11207 | 28482 | 2.0% |
| 11236 | 19822 | 1.4% |
| 11101 | 19396 | 1.3% |
| 11203 | 18806 | 1.3% |
| 11234 | 18286 | 1.3% |
| 11385 | 18061 | 1.2% |
| 11208 | 17665 | 1.2% |
| 11212 | 17471 | 1.2% |
| 11226 | 17305 | 1.2% |
| 11201 | 17264 | 1.2% |
| Other values (222) | 1252972 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 2807279 | |
| 0 | 1274113 | |
| 2 | 850683 | 11.8% |
| 3 | 633215 | 8.8% |
| 4 | 516483 | 7.1% |
| 6 | 322852 | 4.5% |
| 5 | 284288 | 3.9% |
| 7 | 246661 | 3.4% |
| 8 | 152087 | 2.1% |
| 9 | 139989 | 1.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 7227650 | |
| Space Separator | 155 | < 0.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 2807279 | |
| 0 | 1274113 | |
| 2 | 850683 | 11.8% |
| 3 | 633215 | 8.8% |
| 4 | 516483 | 7.1% |
| 6 | 322852 | 4.5% |
| 5 | 284288 | 3.9% |
| 7 | 246661 | 3.4% |
| 8 | 152087 | 2.1% |
| 9 | 139989 | 1.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 155 | |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 7227805 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 2807279 | |
| 0 | 1274113 | |
| 2 | 850683 | 11.8% |
| 3 | 633215 | 8.8% |
| 4 | 516483 | 7.1% |
| 6 | 322852 | 4.5% |
| 5 | 284288 | 3.9% |
| 7 | 246661 | 3.4% |
| 8 | 152087 | 2.1% |
| 9 | 139989 | 1.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7227805 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 2807279 | |
| 0 | 1274113 | |
| 2 | 850683 | 11.8% |
| 3 | 633215 | 8.8% |
| 4 | 516483 | 7.1% |
| 6 | 322852 | 4.5% |
| 5 | 284288 | 3.9% |
| 7 | 246661 | 3.4% |
| 8 | 152087 | 2.1% |
| 9 | 139989 | 1.9% |
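Every zip_code value has length 5, yet the column contains 155 space characters. One reading, which is an assumption rather than something the report states, is that 31 entries consist of five spaces and should be treated as missing. A minimal validation sketch using only the standard library (`clean_zip` is a hypothetical helper):

```python
import re

ZIP_PATTERN = re.compile(r"\d{5}")


def clean_zip(value):
    """Return a 5-digit ZIP code string, or None if the value is not one."""
    if value is None:
        return None
    candidate = str(value).strip()
    return candidate if ZIP_PATTERN.fullmatch(candidate) else None


# Arithmetic behind the blank-entry estimate: 155 spaces, 5 characters each.
blank_entries = 155 // 5

print(clean_zip("11207"))
print(clean_zip("     "))
print(blank_entries)
```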
latitude
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 128281 |
|---|---|
| Distinct (%) | 6.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 40.723911 |
| Minimum | 40.498947 |
| Maximum | 40.912884 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 29.0 MiB |
Quantile statistics
| Minimum | 40.498947 |
|---|---|
| 5-th percentile | 40.597683 |
| Q1 | 40.667915 |
| median | 40.720673 |
| Q3 | 40.769653 |
| 95-th percentile | 40.862152 |
| Maximum | 40.912884 |
| Range | 0.413937 |
| Interquartile range (IQR) | 0.1017376 |
Descriptive statistics
| Standard deviation | 0.079187767 |
|---|---|
| Coefficient of variation (CV) | 0.001944503 |
| Kurtosis | -0.55754253 |
| Mean | 40.723911 |
| Median Absolute Deviation (MAD) | 0.0512144 |
| Skewness | 0.11440902 |
| Sum | 77438106 |
| Variance | 0.0062707024 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 40.861862 | 914 | < 0.1% |
| 40.696033 | 801 | < 0.1% |
| 40.759308 | 631 | < 0.1% |
| 40.8047 | 597 | < 0.1% |
| 40.675735 | 589 | < 0.1% |
| 40.6960346 | 587 | < 0.1% |
| 40.658577 | 547 | < 0.1% |
| 40.75898 | 500 | < 0.1% |
| 40.69168 | 491 | < 0.1% |
| 40.7606005 | 474 | < 0.1% |
| Other values (128271) | 1895408 |
| Value | Count | Frequency (%) |
| 40.498947 | 1 | |
| 40.4989488 | 2 | |
| 40.4991346 | 1 | |
| 40.49931 | 1 | |
| 40.4994787 | 1 | |
| 40.499659 | 1 | |
| 40.499672 | 1 | |
| 40.49971 | 1 | |
| 40.49984 | 1 | |
| 40.499842 | 2 |
| Value | Count | Frequency (%) |
| 40.912884 | 13 | |
| 40.9128276 | 1 | < 0.1% |
| 40.912827 | 2 | < 0.1% |
| 40.912647 | 1 | < 0.1% |
| 40.91257 | 1 | < 0.1% |
| 40.912537 | 2 | < 0.1% |
| 40.9124681 | 24 | |
| 40.912468 | 18 | |
| 40.912292 | 1 | < 0.1% |
| 40.9122231 | 4 | < 0.1% |
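Several of the latitude figures are derived from others in the same tables, which gives a quick consistency check. The sketch below recomputes them from the reported (rounded) values, so small differences in the last digits are expected:

```python
# Values copied from the latitude tables above.
mean = 40.723911
std = 0.079187767
q1, q3 = 40.667915, 40.769653
minimum, maximum = 40.498947, 40.912884

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV       = {cv:.9f}")
print(f"Variance = {variance:.10f}")
print(f"IQR      = {iqr:.6f}")
print(f"Range    = {value_range:.6f}")
```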
longitude
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 99584 |
|---|---|
| Distinct (%) | 5.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -73.920058 |
| Minimum | -74.25496 |
| Maximum | -73.70055 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 1901539 |
| Negative (%) | 100.0% |
| Memory size | 29.0 MiB |
Quantile statistics
| Minimum | -74.25496 |
|---|---|
| 5-th percentile | -74.03552 |
| Q1 | -73.974754 |
| median | -73.9271 |
| Q3 | -73.86719 |
| 95-th percentile | -73.765072 |
| Maximum | -73.70055 |
| Range | 0.55441 |
| Interquartile range (IQR) | 0.107564 |
Descriptive statistics
| Standard deviation | 0.086177891 |
|---|---|
| Coefficient of variation (CV) | -0.0011658255 |
| Kurtosis | 0.8578712 |
| Mean | -73.920058 |
| Median Absolute Deviation (MAD) | 0.05244 |
| Skewness | -0.20288511 |
| Sum | -1.4056187 × 10⁸ |
| Variance | 0.007426629 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -73.89063 | 782 | < 0.1% |
| -73.98453 | 717 | < 0.1% |
| -73.91282 | 717 | < 0.1% |
| -73.89686 | 678 | < 0.1% |
| -73.91243 | 654 | < 0.1% |
| -73.94476 | 624 | < 0.1% |
| -73.9112 | 594 | < 0.1% |
| -73.9845292 | 587 | < 0.1% |
| -73.882744 | 552 | < 0.1% |
| -73.91727 | 543 | < 0.1% |
| Other values (99574) | 1895091 |
| Value | Count | Frequency (%) |
| -74.25496 | 1 | < 0.1% |
| -74.254845 | 1 | < 0.1% |
| -74.2545316 | 1 | < 0.1% |
| -74.25393 | 2 | |
| -74.253174 | 1 | < 0.1% |
| -74.2530308 | 1 | < 0.1% |
| -74.253006 | 2 | |
| -74.2529994 | 2 | |
| -74.252884 | 1 | < 0.1% |
| -74.2528764 | 3 |
| Value | Count | Frequency (%) |
| -73.70055 | 2 | < 0.1% |
| -73.700584 | 11 | |
| -73.7005968 | 10 | |
| -73.7006 | 1 | < 0.1% |
| -73.70061 | 5 | |
| -73.70071 | 4 | < 0.1% |
| -73.70073 | 1 | < 0.1% |
| -73.70074 | 1 | < 0.1% |
| -73.70076 | 2 | < 0.1% |
| -73.7007673 | 1 | < 0.1% |
location
Text
| Distinct | 311956 |
|---|---|
| Distinct (%) | 16.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 159.2 MiB |
Length
| Max length | 25 |
|---|---|
| Median length | 24 |
| Mean length | 22.77039 |
| Min length | 16 |
Characters and Unicode
| Total characters | 43298784 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 6 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 177073 |
|---|---|
| Unique (%) | 9.3% |
Sample
| 1st row | (40.62179, -73.970024) |
|---|---|
| 2nd row | (40.667202, -73.8665) |
| 3rd row | (40.709183, -73.956825) |
| 4th row | (40.86816, -73.83148) |
| 5th row | (40.67172, -73.8971) |
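The location values are text renderings of a coordinate pair, which is why the word counts that follow list latitudes and longitudes as separate tokens without the minus sign. A minimal parsing sketch, assuming every value has the "(lat, lon)" shape seen in the sample (`parse_location` is a hypothetical helper):

```python
def parse_location(text):
    """Parse a "(lat, lon)" string into a (float, float) tuple."""
    lat_text, lon_text = text.strip().strip("()").split(",")
    return float(lat_text), float(lon_text)


print(parse_location("(40.62179, -73.970024)"))
```

Because latitude and longitude already exist as numeric columns with no missing values, this column is likely redundant for analysis.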
| Value | Count | Frequency (%) |
| 40.861862 | 914 | < 0.1% |
| 40.696033 | 801 | < 0.1% |
| 73.89063 | 782 | < 0.1% |
| 73.91282 | 717 | < 0.1% |
| 73.98453 | 717 | < 0.1% |
| 73.89686 | 678 | < 0.1% |
| 73.91243 | 654 | < 0.1% |
| 40.759308 | 631 | < 0.1% |
| 73.94476 | 624 | < 0.1% |
| 40.8047 | 597 | < 0.1% |
| Other values (227855) | 3795963 |
Most occurring characters
| Value | Count | Frequency (%) |
| 7 | 4744864 | |
| 4 | 4113725 | 9.5% |
| . | 3803078 | 8.8% |
| 3 | 3614874 | 8.3% |
| 0 | 3492887 | 8.1% |
| 9 | 2782758 | 6.4% |
| 8 | 2734348 | 6.3% |
| 6 | 2705540 | 6.2% |
| 5 | 2161947 | 5.0% |
| ( | 1901539 | 4.4% |
| Other values (6) | 11243224 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 29988011 | |
| Other Punctuation | 5704617 | 13.2% |
| Open Punctuation | 1901539 | 4.4% |
| Space Separator | 1901539 | 4.4% |
| Dash Punctuation | 1901539 | 4.4% |
| Close Punctuation | 1901539 | 4.4% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 7 | 4744864 | |
| 4 | 4113725 | |
| 3 | 3614874 | |
| 0 | 3492887 | |
| 9 | 2782758 | |
| 8 | 2734348 | |
| 6 | 2705540 | |
| 5 | 2161947 | |
| 2 | 1836428 | 6.1% |
| 1 | 1800640 | 6.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 3803078 | |
| , | 1901539 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 1901539 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1901539 | |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1901539 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 1901539 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 43298784 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 7 | 4744864 | |
| 4 | 4113725 | 9.5% |
| . | 3803078 | 8.8% |
| 3 | 3614874 | 8.3% |
| 0 | 3492887 | 8.1% |
| 9 | 2782758 | 6.4% |
| 8 | 2734348 | 6.3% |
| 6 | 2705540 | 6.2% |
| 5 | 2161947 | 5.0% |
| ( | 1901539 | 4.4% |
| Other values (6) | 11243224 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 43298784 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 7 | 4744864 | |
| 4 | 4113725 | 9.5% |
| . | 3803078 | 8.8% |
| 3 | 3614874 | 8.3% |
| 0 | 3492887 | 8.1% |
| 9 | 2782758 | 6.4% |
| 8 | 2734348 | 6.3% |
| 6 | 2705540 | 6.2% |
| 5 | 2161947 | 5.0% |
| ( | 1901539 | 4.4% |
| Other values (6) | 11243224 |
on_street_name
Text
MISSING 
| Distinct | 15502 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 404419 |
| Missing (%) | 21.3% |
| Memory size | 149.5 MiB |
Length
| Max length | 32 |
|---|---|
| Median length | 32 |
| Mean length | 28.918011 |
| Min length | 4 |
Characters and Unicode
| Total characters | 43293733 |
|---|---|
| Distinct characters | 71 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3205 |
|---|---|
| Unique (%) | 0.2% |
Sample
| 1st row | OCEAN PARKWAY |
|---|---|
| 2nd row | BROOKLYN QUEENS EXPRESSWAY |
| 3rd row | 3 AVENUE |
| 4th row | MYRTLE AVENUE |
| 5th row | SPRINGFIELD BOULEVARD |
| Value | Count | Frequency (%) |
| avenue | 565008 | 16.5% |
| street | 492131 | 14.3% |
| east | 144988 | 4.2% |
| boulevard | 109706 | 3.2% |
| west | 108140 | 3.2% |
| parkway | 63394 | 1.8% |
| road | 58543 | 1.7% |
| expressway | 51508 | 1.5% |
| island | 28031 | 0.8% |
| queens | 22604 | 0.7% |
| Other values (4400) | 1787256 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 24568140 | |
| E | 3363038 | 7.8% |
| A | 1750796 | 4.0% |
| T | 1707359 | 3.9% |
| R | 1461187 | 3.4% |
| S | 1291668 | 3.0% |
| N | 1290048 | 3.0% |
| U | 882479 | 2.0% |
| V | 784172 | 1.8% |
| O | 751465 | 1.7% |
| Other values (61) | 5443381 | 12.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Space Separator | 24568140 | |
| Uppercase Letter | 17545098 | |
| Decimal Number | 1123290 | 2.6% |
| Lowercase Letter | 50286 | 0.1% |
| Open Punctuation | 2467 | < 0.1% |
| Close Punctuation | 2466 | < 0.1% |
| Other Punctuation | 1924 | < 0.1% |
| Dash Punctuation | 62 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 3363038 | |
| A | 1750796 | |
| T | 1707359 | |
| R | 1461187 | 8.3% |
| S | 1291668 | 7.4% |
| N | 1290048 | 7.4% |
| U | 882479 | 5.0% |
| V | 784172 | 4.5% |
| O | 751465 | 4.3% |
| L | 573758 | 3.3% |
| Other values (16) | 3689128 |
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 5225 | 10.4% |
| e | 4801 | 9.5% |
| r | 4454 | 8.9% |
| y | 4124 | 8.2% |
| a | 3493 | 6.9% |
| o | 3355 | 6.7% |
| l | 2893 | 5.8% |
| s | 2857 | 5.7% |
| k | 2631 | 5.2% |
| t | 2291 | 4.6% |
| Other values (16) | 14162 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 254436 | |
| 3 | 127110 | |
| 2 | 124958 | |
| 4 | 106502 | |
| 5 | 104186 | |
| 6 | 91682 | 8.2% |
| 8 | 84212 | 7.5% |
| 7 | 83740 | 7.5% |
| 9 | 74775 | 6.7% |
| 0 | 71689 | 6.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 1510 | |
| / | 383 | 19.9% |
| ' | 26 | 1.4% |
| & | 4 | 0.2% |
| # | 1 | 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 24568140 | |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2467 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2466 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 62 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 25698349 | |
| Latin | 17595384 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| E | 3363038 | |
| A | 1750796 | |
| T | 1707359 | |
| R | 1461187 | 8.3% |
| S | 1291668 | 7.3% |
| N | 1290048 | 7.3% |
| U | 882479 | 5.0% |
| V | 784172 | 4.5% |
| O | 751465 | 4.3% |
| L | 573758 | 3.3% |
| Other values (42) | 3739414 |
Common
| Value | Count | Frequency (%) |
| (space) | 24568140 | |
| 1 | 254436 | 1.0% |
| 3 | 127110 | 0.5% |
| 2 | 124958 | 0.5% |
| 4 | 106502 | 0.4% |
| 5 | 104186 | 0.4% |
| 6 | 91682 | 0.4% |
| 8 | 84212 | 0.3% |
| 7 | 83740 | 0.3% |
| 9 | 74775 | 0.3% |
| Other values (9) | 78608 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 43293733 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 24568140 | |
| E | 3363038 | 7.8% |
| A | 1750796 | 4.0% |
| T | 1707359 | 3.9% |
| R | 1461187 | 3.4% |
| S | 1291668 | 3.0% |
| N | 1290048 | 3.0% |
| U | 882479 | 2.0% |
| V | 784172 | 1.8% |
| O | 751465 | 1.7% |
| Other values (61) | 5443381 | 12.6% |
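The length and character statistics for on_street_name suggest fixed-width padding: the median and maximum length are both 32 although the sample names are much shorter, and spaces make up 24,568,140 of the 43,293,733 characters. The 50,286 lowercase letters also point to inconsistent casing. A minimal normalization sketch under those assumptions (`normalize_street` is a hypothetical helper):

```python
def normalize_street(name):
    """Trim padding, collapse repeated spaces, and upper-case a street name."""
    if name is None:
        return None
    # str.split() with no argument drops leading, trailing, and repeated whitespace.
    collapsed = " ".join(name.split())
    return collapsed.upper() if collapsed else None


print(normalize_street("OCEAN PARKWAY" + " " * 19))
print(normalize_street("3  avenue"))
```

Normalizing before counting distinct values would probably reduce the 15,502 distinct names, since padded and mixed-case variants collapse together.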
cross_street_name
Text
MISSING 
| Distinct | 17646 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 714151 |
| Missing (%) | 37.6% |
| Memory size | 125.9 MiB |
Length
| Max length | 32 |
|---|---|
| Median length | 31 |
| Mean length | 22.162937 |
| Min length | 3 |
Characters and Unicode
| Total characters | 26316005 |
|---|---|
| Distinct characters | 66 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3115 |
|---|---|
| Unique (%) | 0.3% |
Sample
| 1st row | AVENUE K |
|---|---|
| 2nd row | EAST 43 STREET |
| 3rd row | EAST GATE PLAZA |
| 4th row | 150 STREET |
| 5th row | HEATH AVENUE |
| Value | Count | Frequency (%) |
| avenue | 526600 | |
| street | 427312 | 16.3% |
| east | 105100 | 4.0% |
| west | 65102 | 2.5% |
| boulevard | 58025 | 2.2% |
| road | 47113 | 1.8% |
| place | 31235 | 1.2% |
| 3 | 18046 | 0.7% |
| parkway | 17055 | 0.7% |
| broadway | 16373 | 0.6% |
| Other values (4663) | 1303925 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 12596230 | |
| E | 2703052 | 10.3% |
| T | 1352338 | 5.1% |
| A | 1277342 | 4.9% |
| R | 1015168 | 3.9% |
| N | 981532 | 3.7% |
| S | 897882 | 3.4% |
| U | 714689 | 2.7% |
| V | 658531 | 2.5% |
| O | 511547 | 1.9% |
| Other values (56) | 3607694 | 13.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 12703853 | |
| Space Separator | 12596230 | |
| Decimal Number | 1014388 | 3.9% |
| Lowercase Letter | 1470 | < 0.1% |
| Other Punctuation | 53 | < 0.1% |
| Dash Punctuation | 11 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 2703052 | |
| T | 1352338 | |
| A | 1277342 | |
| R | 1015168 | 8.0% |
| N | 981532 | 7.7% |
| S | 897882 | 7.1% |
| U | 714689 | 5.6% |
| V | 658531 | 5.2% |
| O | 511547 | 4.0% |
| L | 389902 | 3.1% |
| Other values (16) | 2201870 |
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 279 | |
| t | 174 | |
| r | 129 | |
| a | 129 | |
| n | 98 | 6.7% |
| s | 97 | 6.6% |
| v | 75 | 5.1% |
| o | 73 | 5.0% |
| l | 66 | 4.5% |
| d | 60 | 4.1% |
| Other values (14) | 290 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 224868 | |
| 2 | 119158 | |
| 3 | 111500 | |
| 5 | 91639 | |
| 4 | 91280 | |
| 7 | 80936 | 8.0% |
| 8 | 80374 | 7.9% |
| 6 | 79821 | 7.9% |
| 9 | 69731 | 6.9% |
| 0 | 65081 | 6.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 43 | |
| & | 5 | 9.4% |
| / | 4 | 7.5% |
| . | 1 | 1.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 12596230 | |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 11 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 13610682 | |
| Latin | 12705323 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| E | 2703052 | |
| T | 1352338 | |
| A | 1277342 | |
| R | 1015168 | 8.0% |
| N | 981532 | 7.7% |
| S | 897882 | 7.1% |
| U | 714689 | 5.6% |
| V | 658531 | 5.2% |
| O | 511547 | 4.0% |
| L | 389902 | 3.1% |
| Other values (40) | 2203340 |
Common
| Value | Count | Frequency (%) |
| (space) | 12596230 | |
| 1 | 224868 | 1.7% |
| 2 | 119158 | 0.9% |
| 3 | 111500 | 0.8% |
| 5 | 91639 | 0.7% |
| 4 | 91280 | 0.7% |
| 7 | 80936 | 0.6% |
| 8 | 80374 | 0.6% |
| 6 | 79821 | 0.6% |
| 9 | 69731 | 0.5% |
| Other values (6) | 65145 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 26316005 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 12596230 | |
| E | 2703052 | 10.3% |
| T | 1352338 | 5.1% |
| A | 1277342 | 4.9% |
| R | 1015168 | 3.9% |
| N | 981532 | 3.7% |
| S | 897882 | 3.4% |
| U | 714689 | 2.7% |
| V | 658531 | 2.5% |
| O | 511547 | 1.9% |
| Other values (56) | 3607694 | 13.7% |
off_street_name
Text
MISSING 
| Distinct | 225364 |
|---|---|
| Distinct (%) | 65.7% |
| Missing | 1558366 |
| Missing (%) | 82.0% |
| Memory size | 92.1 MiB |
Length
| Max length | 40 |
|---|---|
| Median length | 40 |
| Mean length | 34.838064 |
| Min length | 8 |
Characters and Unicode
| Total characters | 11955483 |
|---|---|
| Distinct characters | 72 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 175768 |
|---|---|
| Unique (%) | 51.2% |
Sample
| 1st row | 1211 LORING AVENUE |
|---|---|
| 2nd row | 344 BAYCHESTER AVENUE |
| 3rd row | 2047 PITKIN AVENUE |
| 4th row | 480 DEAN STREET |
| 5th row | 878 FLATBUSH AVENUE |
| Value | Count | Frequency (%) |
| avenue | 136217 | 12.2% |
| street | 126509 | 11.4% |
| east | 32883 | 3.0% |
| west | 23418 | 2.1% |
| boulevard | 20986 | 1.9% |
| road | 15257 | 1.4% |
| place | 6516 | 0.6% |
| parkway | 6500 | 0.6% |
| broadway | 5345 | 0.5% |
| ave | 4987 | 0.4% |
| Other values (25183) | 734301 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 6526437 | |
| E | 784389 | 6.6% |
| T | 426899 | 3.6% |
| A | 388840 | 3.3% |
| R | 320149 | 2.7% |
| N | 283655 | 2.4% |
| S | 283088 | 2.4% |
| 1 | 277555 | 2.3% |
| U | 198786 | 1.7% |
| 2 | 188384 | 1.6% |
| Other values (62) | 2277301 | 19.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Space Separator | 6526437 | |
| Uppercase Letter | 3894873 | |
| Decimal Number | 1449113 | 12.1% |
| Dash Punctuation | 80307 | 0.7% |
| Other Punctuation | 3068 | < 0.1% |
| Lowercase Letter | 1043 | < 0.1% |
| Open Punctuation | 322 | < 0.1% |
| Close Punctuation | 319 | < 0.1% |
| Control | 1 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 784389 | |
| T | 426899 | |
| A | 388840 | |
| R | 320149 | |
| N | 283655 | 7.3% |
| S | 283088 | 7.3% |
| U | 198786 | 5.1% |
| V | 182842 | 4.7% |
| O | 167178 | 4.3% |
| L | 126822 | 3.3% |
| Other values (16) | 732225 |
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 197 | |
| t | 134 | |
| v | 85 | |
| n | 82 | 7.9% |
| r | 78 | 7.5% |
| a | 68 | 6.5% |
| o | 58 | 5.6% |
| s | 50 | 4.8% |
| d | 43 | 4.1% |
| h | 42 | 4.0% |
| Other values (14) | 206 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 277555 | |
| 2 | 188384 | |
| 0 | 160265 | |
| 3 | 148908 | |
| 5 | 145623 | |
| 4 | 130773 | |
| 6 | 105677 | 7.3% |
| 7 | 103035 | 7.1% |
| 8 | 98380 | 6.8% |
| 9 | 90513 | 6.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 2675 | |
| & | 240 | 7.8% |
| . | 120 | 3.9% |
| @ | 18 | 0.6% |
| ' | 11 | 0.4% |
| * | 3 | 0.1% |
| : | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 6526437 | |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 80307 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 322 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 319 |
Control
| Value | Count | Frequency (%) |
| (control character) | 1 | |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 8059567 | |
| Latin | 3895916 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| E | 784389 | |
| T | 426899 | |
| A | 388840 | |
| R | 320149 | |
| N | 283655 | 7.3% |
| S | 283088 | 7.3% |
| U | 198786 | 5.1% |
| V | 182842 | 4.7% |
| O | 167178 | 4.3% |
| L | 126822 | 3.3% |
| Other values (40) | 733268 |
Common
| Value | Count | Frequency (%) |
| (space) | 6526437 | |
| 1 | 277555 | 3.4% |
| 2 | 188384 | 2.3% |
| 0 | 160265 | 2.0% |
| 3 | 148908 | 1.8% |
| 5 | 145623 | 1.8% |
| 4 | 130773 | 1.6% |
| 6 | 105677 | 1.3% |
| 7 | 103035 | 1.3% |
| 8 | 98380 | 1.2% |
| Other values (12) | 174530 | 2.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 11955483 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 6526437 | |
| E | 784389 | 6.6% |
| T | 426899 | 3.6% |
| A | 388840 | 3.3% |
| R | 320149 | 2.7% |
| N | 283655 | 2.4% |
| S | 283088 | 2.4% |
| 1 | 277555 | 2.3% |
| U | 198786 | 1.7% |
| 2 | 188384 | 1.6% |
| Other values (62) | 2277301 | 19.0% |
number_of_persons_injured
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 30 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.32015068 |
| Minimum | 0 |
| Maximum | 43 |
| Zeros | 1451297 |
| Zeros (%) | 76.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 30.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 2 |
| Maximum | 43 |
| Range | 43 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.70602942 |
|---|---|
| Coefficient of variation (CV) | 2.2053035 |
| Kurtosis | 44.513387 |
| Mean | 0.32015068 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.0766332 |
| Sum | 608779 |
| Variance | 0.49847754 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1451297 | |
| 1 | 350081 | 18.4% |
| 2 | 65567 | 3.4% |
| 3 | 21411 | 1.1% |
| 4 | 7862 | 0.4% |
| 5 | 2991 | 0.2% |
| 6 | 1232 | 0.1% |
| 7 | 514 | < 0.1% |
| 8 | 234 | < 0.1% |
| 9 | 111 | < 0.1% |
| Other values (20) | 239 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 1451297 | |
| 1 | 350081 | 18.4% |
| 2 | 65567 | 3.4% |
| 3 | 21411 | 1.1% |
| 4 | 7862 | 0.4% |
| 5 | 2991 | 0.2% |
| 6 | 1232 | 0.1% |
| 7 | 514 | < 0.1% |
| 8 | 234 | < 0.1% |
| 9 | 111 | < 0.1% |
| Value | Count | Frequency (%) |
| 43 | 1 | < 0.1% |
| 34 | 1 | < 0.1% |
| 32 | 1 | < 0.1% |
| 27 | 1 | < 0.1% |
| 25 | 1 | < 0.1% |
| 24 | 3 | |
| 23 | 1 | < 0.1% |
| 22 | 3 | |
| 21 | 1 | < 0.1% |
| 20 | 2 |
number_of_persons_killed
Real number (ℝ)
HIGH CORRELATION  SKEWED  ZEROS 
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.0015050967 |
| Minimum | 0 |
| Maximum | 8 |
| Zeros | 1898797 |
| Zeros (%) | 99.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 30.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 8 |
| Range | 8 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.041033381 |
|---|---|
| Coefficient of variation (CV) | 27.262954 |
| Kurtosis | 1955.7404 |
| Mean | 0.0015050967 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 33.868688 |
| Sum | 2862 |
| Variance | 0.0016837383 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1898797 | |
| 1 | 2652 | 0.1% |
| 2 | 71 | < 0.1% |
| 3 | 13 | < 0.1% |
| 4 | 4 | < 0.1% |
| 8 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 1898797 | |
| 1 | 2652 | 0.1% |
| 2 | 71 | < 0.1% |
| 3 | 13 | < 0.1% |
| 4 | 4 | < 0.1% |
| 5 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 8 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 4 | 4 | < 0.1% |
| 3 | 13 | < 0.1% |
| 2 | 71 | < 0.1% |
| 1 | 2652 | 0.1% |
| 0 | 1898797 |
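Because number_of_persons_killed takes only seven distinct values, its mean and skewness can be recomputed from the frequency table alone. The sketch below does this in plain Python. It assumes the reported γ1 is the adjusted (sample) Fisher-Pearson coefficient; at this sample size the adjustment changes the result only in the fifth decimal place.

```python
import math

# Frequency table for number_of_persons_killed (value -> count), from above.
counts = {0: 1_898_797, 1: 2_652, 2: 71, 3: 13, 4: 4, 5: 1, 8: 1}

n = sum(counts.values())
mean = sum(v * c for v, c in counts.items()) / n

# Second and third central moments (population form).
m2 = sum(c * (v - mean) ** 2 for v, c in counts.items()) / n
m3 = sum(c * (v - mean) ** 3 for v, c in counts.items()) / n

g1 = m3 / m2 ** 1.5                         # population skewness
G1 = g1 * math.sqrt(n * (n - 1)) / (n - 2)  # adjusted sample skewness

print(f"n = {n}, mean = {mean:.10f}")
print(f"skewness g1 = {g1:.4f}, adjusted G1 = {G1:.4f}")
```

The extreme skew comes from the 99.9% of records with a zero count, so the mean and standard deviation say little about a typical fatal collision.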
number_of_pedestrians_injured
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.05736038 |
| Minimum | 0 |
| Maximum | 27 |
| Zeros | 1797087 |
| Zeros (%) | 94.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 29.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 27 |
| Range | 27 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.24614218 |
|---|---|
| Coefficient of variation (CV) | 4.2911532 |
| Kurtosis | 134.62238 |
| Mean | 0.05736038 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 5.735457 |
| Sum | 109073 |
| Variance | 0.060585974 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1797087 | |
| 1 | 100529 | 5.3% |
| 2 | 3463 | 0.2% |
| 3 | 356 | < 0.1% |
| 4 | 58 | < 0.1% |
| 5 | 22 | < 0.1% |
| 6 | 11 | < 0.1% |
| 7 | 6 | < 0.1% |
| 9 | 2 | < 0.1% |
| 19 | 1 | < 0.1% |
| Other values (4) | 4 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 1797087 | |
| 1 | 100529 | 5.3% |
| 2 | 3463 | 0.2% |
| 3 | 356 | < 0.1% |
| 4 | 58 | < 0.1% |
| 5 | 22 | < 0.1% |
| 6 | 11 | < 0.1% |
| 7 | 6 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 27 | 1 | < 0.1% |
| 19 | 1 | < 0.1% |
| 15 | 1 | < 0.1% |
| 13 | 1 | < 0.1% |
| 9 | 2 | < 0.1% |
| 8 | 1 | < 0.1% |
| 7 | 6 | < 0.1% |
| 6 | 11 | < 0.1% |
| 5 | 22 | < 0.1% |
| 4 | 58 |
number_of_pedestrians_killed
Real number (ℝ)
HIGH CORRELATION  SKEWED  ZEROS 
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.00075307422 |
| Minimum | 0 |
| Maximum | 6 |
| Zeros | 1900129 |
| Zeros (%) | 99.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 29.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.028113548 |
|---|---|
| Coefficient of variation (CV) | 37.331709 |
| Kurtosis | 2703.1956 |
| Mean | 0.00075307422 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 42.473402 |
| Sum | 1432 |
| Variance | 0.00079037158 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1900129 | 99.9% |
| 1 | 1395 | 0.1% |
| 2 | 12 | < 0.1% |
| 4 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 1900129 | 99.9% |
| 1 | 1395 | 0.1% |
| 2 | 12 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 6 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 2 | 12 | < 0.1% |
| 1 | 1395 | 0.1% |
| 0 | 1900129 | 99.9% |
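The Sum, Mean and Variance rows for number_of_pedestrians_killed can be cross-checked against the value counts listed above. The following is a minimal sketch in plain Python, assuming the six value/count pairs are the complete distribution and that the report uses the sample variance (ddof=1, the pandas default). The variable names are illustrative and are not part of the report.

```python
# Value counts for number_of_pedestrians_killed, copied from the table above.
counts = {0: 1900129, 1: 1395, 2: 12, 3: 1, 4: 1, 6: 1}

n = sum(counts.values())                        # number of observations
total = sum(v * c for v, c in counts.items())   # corresponds to the "Sum" row
mean = total / n                                # corresponds to the "Mean" row

# Sample variance (ddof=1); this is an assumption about the report's method.
squared_dev = sum(c * (v - mean) ** 2 for v, c in counts.items())
variance = squared_dev / (n - 1)

print(n, total, mean, variance)
```

With over 1.9 million rows, the sample and population variances differ only beyond the digits the report prints, so either convention is consistent with the Variance row.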
number_of_cyclist_injured
Categorical
IMBALANCE 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 119.7 MiB |
| 0 | 1845855 |
|---|---|
| 1 | 55075 |
| 2 | 588 |
| 3 | 20 |
| 4 | 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1901539 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 ? |
| Distinct scripts | 1 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 1 ? |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1845855 | 97.1% |
| 1 | 55075 | 2.9% |
| 2 | 588 | < 0.1% |
| 3 | 20 | < 0.1% |
| 4 | 1 | < 0.1% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 1845855 | 97.1% |
| 1 | 55075 | 2.9% |
| 2 | 588 | < 0.1% |
| 3 | 20 | < 0.1% |
| 4 | 1 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1845855 | 97.1% |
| 1 | 55075 | 2.9% |
| 2 | 588 | < 0.1% |
| 3 | 20 | < 0.1% |
| 4 | 1 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1901539 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1845855 | 97.1% |
| 1 | 55075 | 2.9% |
| 2 | 588 | < 0.1% |
| 3 | 20 | < 0.1% |
| 4 | 1 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1901539 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1845855 | 97.1% |
| 1 | 55075 | 2.9% |
| 2 | 588 | < 0.1% |
| 3 | 20 | < 0.1% |
| 4 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1901539 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1845855 | 97.1% |
| 1 | 55075 | 2.9% |
| 2 | 588 | < 0.1% |
| 3 | 20 | < 0.1% |
| 4 | 1 | < 0.1% |
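Each Frequency (%) entry for number_of_cyclist_injured is the count divided by the number of observations. The sketch below reproduces that column from the counts above. The helper `format_share` is hypothetical: it imitates the "< 0.1%" and "> 99.9%" labels seen in this report and is not taken from the profiling library.

```python
def format_share(share: float) -> str:
    """Format a share of rows the way the Frequency (%) column appears.

    Hypothetical helper: tiny non-zero shares become "< 0.1%" and shares
    just below one become "> 99.9%"; everything else gets one decimal.
    """
    if share > 0 and round(share, 3) == 0:
        return "< 0.1%"
    if share < 1 and round(share, 3) == 1:
        return "> 99.9%"
    return f"{share:.1%}"


# Value counts for number_of_cyclist_injured, copied from the table above.
counts = {"0": 1845855, "1": 55075, "2": 588, "3": 20, "4": 1}

n = sum(counts.values())
freq = {value: format_share(count / n) for value, count in counts.items()}

for value, label in freq.items():
    print(value, counts[value], label)
```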
number_of_cyclist_killed
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 119.7 MiB |
| 0 | 1901306 |
|---|---|
| 1 | 232 |
| 2 | 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1901539 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 ? |
| Distinct scripts | 1 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 1 ? |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1901306 | > 99.9% |
| 1 | 232 | < 0.1% |
| 2 | 1 | < 0.1% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 1901306 | > 99.9% |
| 1 | 232 | < 0.1% |
| 2 | 1 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1901306 | > 99.9% |
| 1 | 232 | < 0.1% |
| 2 | 1 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1901539 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1901306 | > 99.9% |
| 1 | 232 | < 0.1% |
| 2 | 1 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1901539 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1901306 | > 99.9% |
| 1 | 232 | < 0.1% |
| 2 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1901539 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1901306 | > 99.9% |
| 1 | 232 | < 0.1% |
| 2 | 1 | < 0.1% |
number_of_motorist_injured
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 29 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.22875944 |
| Minimum | 0 |
|---|---|
| Maximum | 43 |
| Zeros | 1616538 |
| Zeros (%) | 85.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 29.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 43 |
| Range | 43 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.66666816 |
|---|---|
| Coefficient of variation (CV) | 2.914276 |
| Kurtosis | 55.395951 |
| Mean | 0.22875944 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.9336135 |
| Sum | 434995 |
| Variance | 0.44444644 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1616538 | 85.0% |
| 1 | 191733 | 10.1% |
| 2 | 59649 | 3.1% |
| 3 | 20742 | 1.1% |
| 4 | 7692 | 0.4% |
| 5 | 2945 | 0.2% |
| 6 | 1189 | 0.1% |
| 7 | 488 | < 0.1% |
| 8 | 228 | < 0.1% |
| 9 | 106 | < 0.1% |
| Other values (19) | 229 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 1616538 | 85.0% |
| 1 | 191733 | 10.1% |
| 2 | 59649 | 3.1% |
| 3 | 20742 | 1.1% |
| 4 | 7692 | 0.4% |
| 5 | 2945 | 0.2% |
| 6 | 1189 | 0.1% |
| 7 | 488 | < 0.1% |
| 8 | 228 | < 0.1% |
| 9 | 106 | < 0.1% |
| Value | Count | Frequency (%) |
| 43 | 1 | < 0.1% |
| 34 | 1 | < 0.1% |
| 30 | 1 | < 0.1% |
| 25 | 1 | < 0.1% |
| 24 | 3 | < 0.1% |
| 23 | 1 | < 0.1% |
| 22 | 2 | < 0.1% |
| 21 | 1 | < 0.1% |
| 20 | 2 | < 0.1% |
| 19 | 2 | < 0.1% |
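The descriptive statistics for number_of_motorist_injured are related by simple arithmetic: the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short sketch that re-derives both from the rounded figures printed above (variable names are illustrative):

```python
# Figures copied from the descriptive statistics of number_of_motorist_injured.
std = 0.66666816    # Standard deviation
mean = 0.22875944   # Mean

cv = std / mean         # Coefficient of variation (CV)
variance = std ** 2     # Variance

print(f"CV = {cv:.6f}, variance = {variance:.8f}")
```

Because the inputs are already rounded, the recomputed values agree with the CV and Variance rows only up to rounding error.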
number_of_motorist_killed
Real number (ℝ)
HIGH CORRELATION  SKEWED  ZEROS 
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.00060161795 |
| Minimum | 0 |
|---|---|
| Maximum | 5 |
| Zeros | 1900485 |
| Zeros (%) | 99.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 29.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.026854409 |
|---|---|
| Coefficient of variation (CV) | 44.63698 |
| Kurtosis | 4024.7538 |
| Mean | 0.00060161795 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 54.459877 |
| Sum | 1144 |
| Variance | 0.00072115927 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1900485 | 99.9% |
| 1 | 983 | 0.1% |
| 2 | 56 | < 0.1% |
| 3 | 12 | < 0.1% |
| 4 | 2 | < 0.1% |
| 5 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 1900485 | 99.9% |
| 1 | 983 | 0.1% |
| 2 | 56 | < 0.1% |
| 3 | 12 | < 0.1% |
| 4 | 2 | < 0.1% |
| 5 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 5 | 1 | < 0.1% |
| 4 | 2 | < 0.1% |
| 3 | 12 | < 0.1% |
| 2 | 56 | < 0.1% |
| 1 | 983 | 0.1% |
| 0 | 1900485 | 99.9% |
contributing_factor_vehicle_1
Text
MISSING 
| Distinct | 55 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 642751 |
| Missing (%) | 33.8% |
| Memory size | 131.4 MiB |
Length
| Max length | 53 |
|---|---|
| Median length | 36 |
| Mean length | 24.03544 |
| Min length | 5 |
Characters and Unicode
| Total characters | 30255523 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 6 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | passing too closely |
|---|---|
| 2nd row | driver inexperience |
| 3rd row | passing too closely |
| 4th row | passing or lane usage improper |
| 5th row | turning improperly |
| Value | Count | Frequency (%) |
| driver | 421819 | 13.3% |
| inattention/distraction | 391028 | 12.3% |
| too | 147831 | 4.7% |
| closely | 147831 | 4.7% |
| to | 138461 | 4.4% |
| failure | 121917 | 3.9% |
| yield | 116233 | 3.7% |
| right-of-way | 116233 | 3.7% |
| passing | 105963 | 3.3% |
| following | 96910 | 3.1% |
| Other values (92) | 1362237 |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 3466096 | |
| t | 2849704 | 9.4% |
| n | 2611557 | 8.6% |
| e | 2523261 | 8.3% |
| r | 2371749 | 7.8% |
| o | 2305062 | 7.6% |
| (space) | 1907675 | 6.3% |
| a | 1896181 | 6.3% |
| s | 1340975 | 4.4% |
| d | 1308800 | 4.3% |
| Other values (20) | 7674463 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 27619565 | |
| Space Separator | 1907675 | 6.3% |
| Other Punctuation | 490159 | 1.6% |
| Dash Punctuation | 234044 | 0.8% |
| Open Punctuation | 2040 | < 0.1% |
| Close Punctuation | 2040 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 3466096 | |
| t | 2849704 | |
| n | 2611557 | |
| e | 2523261 | |
| r | 2371749 | |
| o | 2305062 | |
| a | 1896181 | 6.9% |
| s | 1340975 | 4.9% |
| d | 1308800 | 4.7% |
| l | 1265587 | 4.6% |
| Other values (15) | 5680593 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1907675 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 490159 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 234044 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2040 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2040 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 27619565 | |
| Common | 2635958 | 8.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 3466096 | |
| t | 2849704 | |
| n | 2611557 | |
| e | 2523261 | |
| r | 2371749 | |
| o | 2305062 | |
| a | 1896181 | 6.9% |
| s | 1340975 | 4.9% |
| d | 1308800 | 4.7% |
| l | 1265587 | 4.6% |
| Other values (15) | 5680593 |
Common
| Value | Count | Frequency (%) |
| (space) | 1907675 | 72.4% |
| / | 490159 | 18.6% |
| - | 234044 | 8.9% |
| ( | 2040 | 0.1% |
| ) | 2040 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 30255523 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 3466096 | |
| t | 2849704 | 9.4% |
| n | 2611557 | 8.6% |
| e | 2523261 | 8.3% |
| r | 2371749 | 7.8% |
| o | 2305062 | 7.6% |
| (space) | 1907675 | 6.3% |
| a | 1896181 | 6.3% |
| s | 1340975 | 4.4% |
| d | 1308800 | 4.3% |
| Other values (20) | 7674463 |
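The Missing (%) and Mean length rows for contributing_factor_vehicle_1 can be recovered from the raw counts. The sketch below assumes the character statistics are computed over non-missing rows only, which is consistent with the figures above. The variable names are illustrative.

```python
# Figures copied from the report for contributing_factor_vehicle_1.
n_rows = 1901539         # Number of observations (dataset statistics)
n_missing = 642751       # Missing
total_chars = 30255523   # Total characters

n_present = n_rows - n_missing            # rows that carry a value
missing_pct = 100 * n_missing / n_rows    # Missing (%)
mean_length = total_chars / n_present     # Mean length over non-missing rows

print(n_present, round(missing_pct, 1), round(mean_length, 5))
```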
contributing_factor_vehicle_2
Text
MISSING 
| Distinct | 55 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1650130 |
| Missing (%) | 86.8% |
| Memory size | 84.3 MiB |
Length
| Max length | 53 |
|---|---|
| Median length | 43 |
| Mean length | 24.092757 |
| Min length | 5 |
Characters and Unicode
| Total characters | 6057136 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 6 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | other vehicular |
|---|---|
| 2nd row | driver inattention/distraction |
| 3rd row | driver inattention/distraction |
| 4th row | passing or lane usage improper |
| 5th row | driver inattention/distraction |
| Value | Count | Frequency (%) |
| driver | 93346 | 15.2% |
| inattention/distraction | 87133 | 14.2% |
| other | 30279 | 4.9% |
| vehicular | 29369 | 4.8% |
| too | 24745 | 4.0% |
| closely | 24745 | 4.0% |
| passing | 20071 | 3.3% |
| to | 19525 | 3.2% |
| lane | 18105 | 3.0% |
| following | 16480 | 2.7% |
| Other values (92) | 249686 |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 715076 | |
| t | 609053 | |
| n | 526981 | 8.7% |
| r | 519061 | 8.6% |
| e | 513871 | 8.5% |
| o | 456824 | 7.5% |
| a | 374929 | 6.2% |
| (space) | 362075 | 6.0% |
| d | 272409 | 4.5% |
| s | 264969 | 4.4% |
| Other values (20) | 1441888 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5553271 | |
| Space Separator | 362075 | 6.0% |
| Other Punctuation | 109409 | 1.8% |
| Dash Punctuation | 31865 | 0.5% |
| Open Punctuation | 258 | < 0.1% |
| Close Punctuation | 258 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 715076 | |
| t | 609053 | |
| n | 526981 | |
| r | 519061 | |
| e | 513871 | |
| o | 456824 | |
| a | 374929 | 6.8% |
| d | 272409 | 4.9% |
| s | 264969 | 4.8% |
| c | 221124 | 4.0% |
| Other values (15) | 1078974 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 362075 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 109409 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 31865 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 258 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 258 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5553271 | |
| Common | 503865 | 8.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 715076 | |
| t | 609053 | |
| n | 526981 | |
| r | 519061 | |
| e | 513871 | |
| o | 456824 | |
| a | 374929 | 6.8% |
| d | 272409 | 4.9% |
| s | 264969 | 4.8% |
| c | 221124 | 4.0% |
| Other values (15) | 1078974 |
Common
| Value | Count | Frequency (%) |
| (space) | 362075 | 71.9% |
| / | 109409 | 21.7% |
| - | 31865 | 6.3% |
| ( | 258 | 0.1% |
| ) | 258 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6057136 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 715076 | |
| t | 609053 | |
| n | 526981 | 8.7% |
| r | 519061 | 8.6% |
| e | 513871 | 8.5% |
| o | 456824 | 7.5% |
| a | 374929 | 6.2% |
| (space) | 362075 | 6.0% |
| d | 272409 | 4.5% |
| s | 264969 | 4.4% |
| Other values (20) | 1441888 |
contributing_factor_vehicle_3
Categorical
HIGH CORRELATION  MISSING 
| Distinct | 47 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 1892805 |
| Missing (%) | 99.5% |
| Memory size | 72.9 MiB |
| other vehicular | 2625 |
|---|---|
| driver inattention/distraction | 1713 |
| following too closely | 1597 |
| fatigued/drowsy | 624 |
| pavement slippery | 327 |
| Other values (42) | 1848 |
Length
| Max length | 53 |
|---|---|
| Median length | 43 |
| Mean length | 20.67449 |
| Min length | 5 |
Characters and Unicode
| Total characters | 180571 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 6 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 3 ? |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | passing or lane usage improper |
|---|---|
| 2nd row | following too closely |
| 3rd row | following too closely |
| 4th row | other vehicular |
| 5th row | other vehicular |
Common Values
| Value | Count | Frequency (%) |
| other vehicular | 2625 | 0.1% |
| driver inattention/distraction | 1713 | 0.1% |
| following too closely | 1597 | 0.1% |
| fatigued/drowsy | 624 | < 0.1% |
| pavement slippery | 327 | < 0.1% |
| reaction to uninvolved vehicle | 186 | < 0.1% |
| unsafe speed | 160 | < 0.1% |
| driver inexperience | 156 | < 0.1% |
| outside car distraction | 136 | < 0.1% |
| failure to yield right-of-way | 130 | < 0.1% |
| Other values (37) | 1080 | 0.1% |
| (Missing) | 1892805 | 99.5% |
| Value | Count | Frequency (%) |
| other | 2656 | |
| vehicular | 2625 | |
| driver | 1869 | 9.4% |
| inattention/distraction | 1713 | 8.6% |
| too | 1646 | 8.3% |
| closely | 1646 | 8.3% |
| following | 1597 | 8.0% |
| fatigued/drowsy | 624 | 3.1% |
| to | 343 | 1.7% |
| pavement | 342 | 1.7% |
| Other values (77) | 4833 |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 17510 | 9.7% |
| o | 17367 | 9.6% |
| e | 16793 | 9.3% |
| t | 15958 | 8.8% |
| r | 14439 | 8.0% |
| n | 11581 | 6.4% |
| l | 11261 | 6.2% |
| (space) | 11160 | 6.2% |
| a | 9461 | 5.2% |
| c | 7818 | 4.3% |
| Other values (20) | 47223 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 166502 | |
| Space Separator | 11160 | 6.2% |
| Other Punctuation | 2609 | 1.4% |
| Dash Punctuation | 276 | 0.2% |
| Open Punctuation | 12 | < 0.1% |
| Close Punctuation | 12 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 17510 | |
| o | 17367 | |
| e | 16793 | |
| t | 15958 | |
| r | 14439 | 8.7% |
| n | 11581 | 7.0% |
| l | 11261 | 6.8% |
| a | 9461 | 5.7% |
| c | 7818 | 4.7% |
| d | 6443 | 3.9% |
| Other values (15) | 37871 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 11160 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 2609 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 276 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 12 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 12 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 166502 | |
| Common | 14069 | 7.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 17510 | |
| o | 17367 | |
| e | 16793 | |
| t | 15958 | |
| r | 14439 | 8.7% |
| n | 11581 | 7.0% |
| l | 11261 | 6.8% |
| a | 9461 | 5.7% |
| c | 7818 | 4.7% |
| d | 6443 | 3.9% |
| Other values (15) | 37871 |
Common
| Value | Count | Frequency (%) |
| (space) | 11160 | 79.3% |
| / | 2609 | 18.5% |
| - | 276 | 2.0% |
| ( | 12 | 0.1% |
| ) | 12 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 180571 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 17510 | 9.7% |
| o | 17367 | 9.6% |
| e | 16793 | 9.3% |
| t | 15958 | 8.8% |
| r | 14439 | 8.0% |
| n | 11581 | 6.4% |
| l | 11261 | 6.2% |
| (space) | 11160 | 6.2% |
| a | 9461 | 5.2% |
| c | 7818 | 4.3% |
| Other values (20) | 47223 |
contributing_factor_vehicle_4
Categorical
HIGH CORRELATION  MISSING 
| Distinct | 41 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 1899878 |
| Missing (%) | 99.9% |
| Memory size | 72.6 MiB |
| other vehicular | 595 |
|---|---|
| following too closely | 313 |
| driver inattention/distraction | 242 |
| fatigued/drowsy | 124 |
| pavement slippery | 97 |
| Other values (36) | 290 |
Length
| Max length | 43 |
|---|---|
| Median length | 30 |
| Mean length | 19.568332 |
| Min length | 5 |
Characters and Unicode
| Total characters | 32503 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 6 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 10 ? |
|---|---|
| Unique (%) | 0.6% |
Sample
| 1st row | other vehicular |
|---|---|
| 2nd row | reaction to uninvolved vehicle |
| 3rd row | other vehicular |
| 4th row | pavement defective |
| 5th row | other vehicular |
Common Values
| Value | Count | Frequency (%) |
| other vehicular | 595 | < 0.1% |
| following too closely | 313 | < 0.1% |
| driver inattention/distraction | 242 | < 0.1% |
| fatigued/drowsy | 124 | < 0.1% |
| pavement slippery | 97 | < 0.1% |
| reaction to uninvolved vehicle | 33 | < 0.1% |
| unsafe speed | 26 | < 0.1% |
| driver inexperience | 26 | < 0.1% |
| outside car distraction | 25 | < 0.1% |
| alcohol involvement | 19 | < 0.1% |
| Other values (31) | 161 | < 0.1% |
| (Missing) | 1899878 | 99.9% |
| Value | Count | Frequency (%) |
| other | 600 | |
| vehicular | 595 | |
| closely | 317 | |
| too | 317 | |
| following | 313 | |
| driver | 268 | 7.2% |
| inattention/distraction | 242 | 6.5% |
| fatigued/drowsy | 124 | 3.3% |
| pavement | 101 | 2.7% |
| slippery | 97 | 2.6% |
| Other values (66) | 751 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 3185 | 9.8% |
| o | 3177 | 9.8% |
| i | 2902 | 8.9% |
| t | 2675 | 8.2% |
| r | 2592 | 8.0% |
| l | 2258 | 6.9% |
| (space) | 2064 | 6.4% |
| n | 1782 | 5.5% |
| a | 1673 | 5.1% |
| c | 1431 | 4.4% |
| Other values (20) | 8764 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 29996 | |
| Space Separator | 2064 | 6.4% |
| Other Punctuation | 401 | 1.2% |
| Dash Punctuation | 34 | 0.1% |
| Open Punctuation | 4 | < 0.1% |
| Close Punctuation | 4 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 3185 | |
| o | 3177 | |
| i | 2902 | |
| t | 2675 | 8.9% |
| r | 2592 | 8.6% |
| l | 2258 | 7.5% |
| n | 1782 | 5.9% |
| a | 1673 | 5.6% |
| c | 1431 | 4.8% |
| h | 1283 | 4.3% |
| Other values (15) | 7038 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2064 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 401 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 34 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 4 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 4 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 29996 | |
| Common | 2507 | 7.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 3185 | |
| o | 3177 | |
| i | 2902 | |
| t | 2675 | 8.9% |
| r | 2592 | 8.6% |
| l | 2258 | 7.5% |
| n | 1782 | 5.9% |
| a | 1673 | 5.6% |
| c | 1431 | 4.8% |
| h | 1283 | 4.3% |
| Other values (15) | 7038 |
Common
| Value | Count | Frequency (%) |
| (space) | 2064 | 82.3% |
| / | 401 | 16.0% |
| - | 34 | 1.4% |
| ( | 4 | 0.2% |
| ) | 4 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 32503 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 3185 | 9.8% |
| o | 3177 | 9.8% |
| i | 2902 | 8.9% |
| t | 2675 | 8.2% |
| r | 2592 | 8.0% |
| l | 2258 | 6.9% |
| (space) | 2064 | 6.4% |
| n | 1782 | 5.5% |
| a | 1673 | 5.1% |
| c | 1431 | 4.4% |
| Other values (20) | 8764 |
contributing_factor_vehicle_5
Categorical
HIGH CORRELATION  MISSING 
| Distinct | 29 |
|---|---|
| Distinct (%) | 6.3% |
| Missing | 1901079 |
| Missing (%) | > 99.9% |
| Memory size | 72.6 MiB |
| other vehicular | 175 |
|---|---|
| following too closely | 78 |
| driver inattention/distraction | 53 |
| pavement slippery | 44 |
| fatigued/drowsy | 29 |
| Other values (24) | 81 |
Length
| Max length | 43 |
|---|---|
| Median length | 30 |
| Mean length | 18.86087 |
| Min length | 5 |
Characters and Unicode
| Total characters | 8676 |
|---|---|
| Distinct characters | 29 |
| Distinct categories | 6 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 11 ? |
|---|---|
| Unique (%) | 2.4% |
Sample
| 1st row | other vehicular |
|---|---|
| 2nd row | other vehicular |
| 3rd row | pavement slippery |
| 4th row | pavement slippery |
| 5th row | following too closely |
Common Values
| Value | Count | Frequency (%) |
| other vehicular | 175 | < 0.1% |
| following too closely | 78 | < 0.1% |
| driver inattention/distraction | 53 | < 0.1% |
| pavement slippery | 44 | < 0.1% |
| fatigued/drowsy | 29 | < 0.1% |
| alcohol involvement | 11 | < 0.1% |
| obstruction/debris | 10 | < 0.1% |
| reaction to uninvolved vehicle | 9 | < 0.1% |
| unsafe speed | 9 | < 0.1% |
| driver inexperience | 8 | < 0.1% |
| Other values (19) | 34 | < 0.1% |
| (Missing) | 1901079 | > 99.9% |
| Value | Count | Frequency (%) |
| other | 176 | |
| vehicular | 175 | |
| too | 80 | |
| closely | 80 | |
| following | 78 | |
| driver | 61 | 6.1% |
| inattention/distraction | 53 | 5.3% |
| pavement | 45 | 4.5% |
| slippery | 44 | 4.4% |
| fatigued/drowsy | 29 | 2.9% |
| Other values (46) | 185 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 907 | 10.5% |
| o | 825 | 9.5% |
| i | 738 | 8.5% |
| r | 694 | 8.0% |
| t | 686 | 7.9% |
| l | 623 | 7.2% |
| (space) | 546 | 6.3% |
| n | 442 | 5.1% |
| a | 432 | 5.0% |
| c | 383 | 4.4% |
| Other values (19) | 2400 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 8022 | |
| Space Separator | 546 | 6.3% |
| Other Punctuation | 95 | 1.1% |
| Dash Punctuation | 9 | 0.1% |
| Open Punctuation | 2 | < 0.1% |
| Close Punctuation | 2 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 907 | |
| o | 825 | |
| i | 738 | 9.2% |
| r | 694 | 8.7% |
| t | 686 | 8.6% |
| l | 623 | 7.8% |
| n | 442 | 5.5% |
| a | 432 | 5.4% |
| c | 383 | 4.8% |
| h | 380 | 4.7% |
| Other values (14) | 1912 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 546 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 95 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 9 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 8022 | |
| Common | 654 | 7.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 907 | |
| o | 825 | |
| i | 738 | 9.2% |
| r | 694 | 8.7% |
| t | 686 | 8.6% |
| l | 623 | 7.8% |
| n | 442 | 5.5% |
| a | 432 | 5.4% |
| c | 383 | 4.8% |
| h | 380 | 4.7% |
| Other values (14) | 1912 |
Common
| Value | Count | Frequency (%) |
| (space) | 546 | 83.5% |
| / | 95 | 14.5% |
| - | 9 | 1.4% |
| ( | 2 | 0.3% |
| ) | 2 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8676 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 907 | 10.5% |
| o | 825 | 9.5% |
| i | 738 | 8.5% |
| r | 694 | 8.0% |
| t | 686 | 7.9% |
| l | 623 | 7.2% |
| (space) | 546 | 6.3% |
| n | 442 | 5.1% |
| a | 432 | 5.0% |
| c | 383 | 4.4% |
| Other values (19) | 2400 |
collision_id
Real number (ℝ)
HIGH CORRELATION  UNIQUE 
| Distinct | 1901539 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3256245.3 |
| Minimum | 22 |
|---|---|
| Maximum | 4806433 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 29.0 MiB |
Quantile statistics
| Minimum | 22 |
|---|---|
| 5-th percentile | 105069.8 |
| Q1 | 3202737.5 |
| median | 3757207 |
| Q3 | 4278186.5 |
| 95-th percentile | 4701111.2 |
| Maximum | 4806433 |
| Range | 4806411 |
| Interquartile range (IQR) | 1075449 |
Descriptive statistics
| Standard deviation | 1506534.6 |
|---|---|
| Coefficient of variation (CV) | 0.46266004 |
| Kurtosis | 0.17216815 |
| Mean | 3256245.3 |
| Median Absolute Deviation (MAD) | 537124 |
| Skewness | -1.2861947 |
| Sum | 6.1918775 × 10^12 |
| Variance | 2.2696464 × 10^12 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4675373 | 1 | < 0.1% |
| 3318640 | 1 | < 0.1% |
| 3324606 | 1 | < 0.1% |
| 3325472 | 1 | < 0.1% |
| 3311096 | 1 | < 0.1% |
| 3322468 | 1 | < 0.1% |
| 3315127 | 1 | < 0.1% |
| 3316449 | 1 | < 0.1% |
| 3323540 | 1 | < 0.1% |
| 3315524 | 1 | < 0.1% |
| Other values (1901529) | 1901529 |
| Value | Count | Frequency (%) |
| 22 | 1 | |
| 23 | 1 | |
| 25 | 1 | |
| 26 | 1 | |
| 27 | 1 | |
| 28 | 1 | |
| 29 | 1 | |
| 30 | 1 | |
| 31 | 1 | |
| 32 | 1 |
| Value | Count | Frequency (%) |
| 4806433 | 1 | |
| 4806432 | 1 | |
| 4806429 | 1 | |
| 4806428 | 1 | |
| 4806425 | 1 | |
| 4806423 | 1 | |
| 4806422 | 1 | |
| 4806409 | 1 | |
| 4806408 | 1 | |
| 4806407 | 1 |
| Distinct | 1645 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 148.3 MiB |
Length
| Max length | 38 |
|---|---|
| Median length | 35 |
| Mean length | 16.765991 |
| Min length | 1 |
Characters and Unicode
| Total characters | 31881185 |
|---|---|
| Distinct characters | 77 |
| Distinct categories | 11 ? |
| Distinct scripts | 3 ? |
| Distinct blocks | 3 ? |
Unique
| Unique | 1001 ? |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | Moped |
|---|---|
| 2nd row | Sedan |
| 3rd row | Sedan |
| 4th row | Sedan |
| 5th row | Sedan |
| Value | Count | Frequency (%) |
| vehicle | 797022 | |
| sedan | 598331 | |
| utility | 591945 | |
| station | 591907 | |
| wagon/sport | 441234 | |
| passenger | 346612 | |
| 151982 | 3.4% | |
| wagon | 150726 | 3.4% |
| sport | 150672 | 3.4% |
| truck | 80835 | 1.8% |
| Other values (946) | 542047 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 2553229 | 8.0% |
| S | 2496024 | 7.8% |
| t | 2238694 | 7.0% |
| i | 1885086 | 5.9% |
| a | 1571124 | 4.9% |
| e | 1564629 | 4.9% |
| E | 1517050 | 4.8% |
| n | 1502079 | 4.7% |
| o | 1398623 | 4.4% |
| T | 974139 | 3.1% |
| Other values (67) | 14180508 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 15149372 | |
| Uppercase Letter | 13436109 | |
| Space Separator | 2553229 | 8.0% |
| Other Punctuation | 593269 | 1.9% |
| Decimal Number | 53740 | 0.2% |
| Dash Punctuation | 49760 | 0.2% |
| Open Punctuation | 22853 | 0.1% |
| Close Punctuation | 22849 | 0.1% |
| Modifier Symbol | 2 | < 0.1% |
| Control | 1 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 2496024 | |
| E | 1517050 | |
| T | 974139 | 7.3% |
| I | 881523 | 6.6% |
| V | 860346 | 6.4% |
| A | 735544 | 5.5% |
| N | 724759 | 5.4% |
| U | 645473 | 4.8% |
| W | 610751 | 4.5% |
| R | 603278 | 4.5% |
| Other values (18) | 3387222 |
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 2238694 | |
| i | 1885086 | |
| a | 1571124 | |
| e | 1564629 | |
| n | 1502079 | |
| o | 1398623 | |
| l | 917798 | |
| d | 634765 | 4.2% |
| r | 595949 | 3.9% |
| c | 583453 | 3.9% |
| Other values (15) | 2257172 |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 39939 | |
| 6 | 11407 | 21.2% |
| 2 | 1887 | 3.5% |
| 3 | 337 | 0.6% |
| 5 | 49 | 0.1% |
| 0 | 45 | 0.1% |
| 1 | 43 | 0.1% |
| 9 | 16 | < 0.1% |
| 8 | 10 | < 0.1% |
| 7 | 7 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 593239 | |
| . | 15 | < 0.1% |
| # | 8 | < 0.1% |
| , | 3 | < 0.1% |
| ' | 2 | < 0.1% |
| ? | 1 | < 0.1% |
| & | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2553229 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 49760 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 22853 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 22849 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 2 |
Control
| Value | Count | Frequency (%) |
| | 1 |
Other Symbol
| Value | Count | Frequency (%) |
| � | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 28585479 | |
| Common | 3295704 | 10.3% |
| Cyrillic | 2 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| S | 2496024 | 8.7% |
| t | 2238694 | 7.8% |
| i | 1885086 | 6.6% |
| a | 1571124 | 5.5% |
| e | 1564629 | 5.5% |
| E | 1517050 | 5.3% |
| n | 1502079 | 5.3% |
| o | 1398623 | 4.9% |
| T | 974139 | 3.4% |
| l | 917798 | 3.2% |
| Other values (41) | 12520233 |
Common
| Value | Count | Frequency (%) |
| (space) | 2553229 | 77.5% |
| / | 593239 | 18.0% |
| - | 49760 | 1.5% |
| 4 | 39939 | 1.2% |
| ( | 22853 | 0.7% |
| ) | 22849 | 0.7% |
| 6 | 11407 | 0.3% |
| 2 | 1887 | 0.1% |
| 3 | 337 | < 0.1% |
| 5 | 49 | < 0.1% |
| Other values (14) | 155 | < 0.1% |
Cyrillic
| Value | Count | Frequency (%) |
| Х | 1 | |
| Ð | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 31881182 | |
| Cyrillic | 2 | < 0.1% |
| Specials | 1 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 2553229 | 8.0% |
| S | 2496024 | 7.8% |
| t | 2238694 | 7.0% |
| i | 1885086 | 5.9% |
| a | 1571124 | 4.9% |
| e | 1564629 | 4.9% |
| E | 1517050 | 4.8% |
| n | 1502079 | 4.7% |
| o | 1398623 | 4.4% |
| T | 974139 | 3.1% |
| Other values (64) | 14180505 |
Cyrillic
| Value | Count | Frequency (%) |
| Х | 1 | |
| Ð | 1 |
Specials
| Value | Count | Frequency (%) |
| � | 1 |
MISSING 
| Distinct | 1855 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 375446 |
| Missing (%) | 19.7% |
| Memory size | 132.1 MiB |
Length
| Max length | 38 |
|---|---|
| Median length | 30 |
| Mean length | 15.946613 |
| Min length | 1 |
Characters and Unicode
| Total characters | 24336014 |
|---|---|
| Distinct characters | 73 |
| Distinct categories | 9 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 1113 ? |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | Sedan |
|---|---|
| 2nd row | Tractor Truck Diesel |
| 3rd row | Sedan |
| 4th row | Station Wagon/Sport Utility Vehicle |
| 5th row | Station Wagon/Sport Utility Vehicle |
| Value | Count | Frequency (%) |
| vehicle | 582151 | |
| utility | 428255 | |
| station | 428225 | |
| sedan | 411595 | |
| wagon/sport | 311493 | |
| passenger | 263227 | |
| 118016 | 3.4% | |
| wagon | 116786 | 3.4% |
| sport | 116732 | 3.4% |
| truck | 80192 | 2.3% |
| Other values (1018) | 579885 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1921523 | 7.9% |
| S | 1821640 | 7.5% |
| t | 1590468 | 6.5% |
| i | 1368749 | 5.6% |
| E | 1193894 | 4.9% |
| e | 1137927 | 4.7% |
| a | 1107523 | 4.6% |
| n | 1052638 | 4.3% |
| o | 1016075 | 4.2% |
| T | 782475 | 3.2% |
| Other values (63) | 11343102 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10993853 | |
| Uppercase Letter | 10852319 | |
| Space Separator | 1921523 | 7.9% |
| Other Punctuation | 429579 | 1.8% |
| Dash Punctuation | 50558 | 0.2% |
| Decimal Number | 44577 | 0.2% |
| Open Punctuation | 21803 | 0.1% |
| Close Punctuation | 21800 | 0.1% |
| Modifier Symbol | 2 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 1821640 | |
| E | 1193894 | |
| T | 782475 | 7.2% |
| N | 728771 | 6.7% |
| I | 703420 | 6.5% |
| V | 639714 | 5.9% |
| A | 572420 | 5.3% |
| U | 529858 | 4.9% |
| W | 499804 | 4.6% |
| O | 490852 | 4.5% |
| Other values (16) | 2889471 |
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 1590468 | |
| i | 1368749 | |
| e | 1137927 | |
| a | 1107523 | |
| n | 1052638 | |
| o | 1016075 | |
| l | 654224 | 6.0% |
| r | 458703 | 4.2% |
| c | 448371 | 4.1% |
| d | 441339 | 4.0% |
| Other values (15) | 1717836 |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 32019 | |
| 6 | 10753 | 24.1% |
| 2 | 1353 | 3.0% |
| 3 | 314 | 0.7% |
| 0 | 61 | 0.1% |
| 1 | 30 | 0.1% |
| 5 | 27 | 0.1% |
| 9 | 8 | < 0.1% |
| 8 | 7 | < 0.1% |
| 7 | 5 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 429558 | |
| . | 11 | < 0.1% |
| , | 3 | < 0.1% |
| ' | 3 | < 0.1% |
| ? | 2 | < 0.1% |
| # | 1 | < 0.1% |
| & | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1921523 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 50558 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 21803 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 21800 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 21846172 | |
| Common | 2489842 | 10.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| S | 1821640 | 8.3% |
| t | 1590468 | 7.3% |
| i | 1368749 | 6.3% |
| E | 1193894 | 5.5% |
| e | 1137927 | 5.2% |
| a | 1107523 | 5.1% |
| n | 1052638 | 4.8% |
| o | 1016075 | 4.7% |
| T | 782475 | 3.6% |
| N | 728771 | 3.3% |
| Other values (41) | 10046012 |
Common
| Value | Count | Frequency (%) |
| (space) | 1921523 | |
| / | 429558 | 17.3% |
| - | 50558 | 2.0% |
| 4 | 32019 | 1.3% |
| ( | 21803 | 0.9% |
| ) | 21800 | 0.9% |
| 6 | 10753 | 0.4% |
| 2 | 1353 | 0.1% |
| 3 | 314 | < 0.1% |
| 0 | 61 | < 0.1% |
| Other values (12) | 100 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 24336014 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 1921523 | 7.9% |
| S | 1821640 | 7.5% |
| t | 1590468 | 6.5% |
| i | 1368749 | 5.6% |
| E | 1193894 | 4.9% |
| e | 1137927 | 4.7% |
| a | 1107523 | 4.6% |
| n | 1052638 | 4.3% |
| o | 1016075 | 4.2% |
| T | 782475 | 3.2% |
| Other values (63) | 11343102 |
MISSING 
| Distinct | 270 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 1770054 |
| Missing (%) | 93.1% |
| Memory size | 77.9 MiB |
Length
| Max length | 35 |
|---|---|
| Median length | 30 |
| Mean length | 17.694072 |
| Min length | 2 |
Characters and Unicode
| Total characters | 2326505 |
|---|---|
| Distinct characters | 61 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 162 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | Sedan |
|---|---|
| 2nd row | Sedan |
| 3rd row | Station Wagon/Sport Utility Vehicle |
| 4th row | Sedan |
| 5th row | Sedan |
| Value | Count | Frequency (%) |
| vehicle | 58769 | |
| utility | 46494 | |
| station | 46491 | |
| sedan | 45251 | |
| wagon/sport | 35242 | |
| passenger | 23184 | 7.3% |
|  | 11326 | 3.6% |
| wagon | 11249 | 3.5% |
| sport | 11248 | 3.5% |
| truck | 4087 | 1.3% |
| Other values (218) | 23886 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 186091 | 8.0% |
| S | 184164 | 7.9% |
| t | 177583 | 7.6% |
| i | 146587 | 6.3% |
| a | 119104 | 5.1% |
| e | 118814 | 5.1% |
| n | 116576 | 5.0% |
| o | 108677 | 4.7% |
| E | 97137 | 4.2% |
| l | 71827 | 3.1% |
| Other values (51) | 999945 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1156019 | |
| Uppercase Letter | 931012 | |
| Space Separator | 186091 | 8.0% |
| Other Punctuation | 46570 | 2.0% |
| Dash Punctuation | 2898 | 0.1% |
| Decimal Number | 2589 | 0.1% |
| Open Punctuation | 663 | < 0.1% |
| Close Punctuation | 663 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 184164 | |
| E | 97137 | |
| T | 65478 | 7.0% |
| V | 61299 | 6.6% |
| I | 59870 | 6.4% |
| N | 54891 | 5.9% |
| U | 50632 | 5.4% |
| W | 49266 | 5.3% |
| A | 48570 | 5.2% |
| O | 38977 | 4.2% |
| Other values (15) | 220728 |
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 177583 | |
| i | 146587 | |
| a | 119104 | |
| e | 118814 | |
| n | 116576 | |
| o | 108677 | |
| l | 71827 | |
| d | 47391 | 4.1% |
| r | 42798 | 3.7% |
| c | 42477 | 3.7% |
| Other values (14) | 164185 |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 2124 | |
| 6 | 315 | 12.2% |
| 2 | 134 | 5.2% |
| 3 | 12 | 0.5% |
| 8 | 2 | 0.1% |
| 5 | 1 | < 0.1% |
| 0 | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 186091 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 46570 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2898 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 663 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 663 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2087031 | |
| Common | 239474 | 10.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| S | 184164 | 8.8% |
| t | 177583 | 8.5% |
| i | 146587 | 7.0% |
| a | 119104 | 5.7% |
| e | 118814 | 5.7% |
| n | 116576 | 5.6% |
| o | 108677 | 5.2% |
| E | 97137 | 4.7% |
| l | 71827 | 3.4% |
| T | 65478 | 3.1% |
| Other values (39) | 881084 |
Common
| Value | Count | Frequency (%) |
| (space) | 186091 | |
| / | 46570 | 19.4% |
| - | 2898 | 1.2% |
| 4 | 2124 | 0.9% |
| ( | 663 | 0.3% |
| ) | 663 | 0.3% |
| 6 | 315 | 0.1% |
| 2 | 134 | 0.1% |
| 3 | 12 | < 0.1% |
| 8 | 2 | < 0.1% |
| Other values (2) | 2 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2326505 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 186091 | 8.0% |
| S | 184164 | 7.9% |
| t | 177583 | 7.6% |
| i | 146587 | 6.3% |
| a | 119104 | 5.1% |
| e | 118814 | 5.1% |
| n | 116576 | 5.0% |
| o | 108677 | 4.7% |
| E | 97137 | 4.2% |
| l | 71827 | 3.1% |
| Other values (51) | 999945 |
MISSING 
| Distinct | 102 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 1871192 |
| Missing (%) | 98.4% |
| Memory size | 73.8 MiB |
Length
| Max length | 35 |
|---|---|
| Median length | 30 |
| Mean length | 18.059248 |
| Min length | 2 |
Characters and Unicode
| Total characters | 548044 |
|---|---|
| Distinct characters | 57 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 50 |
|---|---|
| Unique (%) | 0.2% |
Sample
| 1st row | Station Wagon/Sport Utility Vehicle |
|---|---|
| 2nd row | Sedan |
| 3rd row | Station Wagon/Sport Utility Vehicle |
| 4th row | Sedan |
| 5th row | Sedan |
| Value | Count | Frequency (%) |
| vehicle | 13988 | |
| station | 11309 | |
| utility | 11309 | |
| sedan | 11118 | |
| wagon/sport | 8890 | |
| passenger | 5057 | 6.8% |
|  | 2428 | 3.3% |
| sport | 2419 | 3.3% |
| wagon | 2419 | 3.3% |
| truck | 749 | 1.0% |
| Other values (101) | 4143 | 5.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 44661 | 8.1% |
| S | 43628 | 8.0% |
| (space) | 43525 | 7.9% |
| i | 36614 | 6.7% |
| a | 29521 | 5.4% |
| e | 29357 | 5.4% |
| n | 29033 | 5.3% |
| o | 27143 | 5.0% |
| E | 20832 | 3.8% |
| l | 18008 | 3.3% |
| Other values (47) | 225722 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 286471 | |
| Uppercase Letter | 205493 | |
| Space Separator | 43525 | 7.9% |
| Other Punctuation | 11318 | 2.1% |
| Dash Punctuation | 583 | 0.1% |
| Decimal Number | 490 | 0.1% |
| Close Punctuation | 82 | < 0.1% |
| Open Punctuation | 82 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 44661 | |
| i | 36614 | |
| a | 29521 | |
| e | 29357 | |
| n | 29033 | |
| o | 27143 | |
| l | 18008 | |
| d | 11559 | 4.0% |
| r | 10303 | 3.6% |
| c | 10258 | 3.6% |
| Other values (14) | 40014 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 43628 | |
| E | 20832 | |
| V | 14390 | 7.0% |
| T | 13729 | 6.7% |
| I | 12712 | 6.2% |
| U | 12021 | 5.8% |
| W | 11803 | 5.7% |
| N | 11561 | 5.6% |
| A | 10337 | 5.0% |
| P | 8142 | 4.0% |
| Other values (14) | 46338 |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 415 | |
| 6 | 39 | 8.0% |
| 2 | 35 | 7.1% |
| 3 | 1 | 0.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 43525 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 11318 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 583 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 82 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 82 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 491964 | |
| Common | 56080 | 10.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 44661 | 9.1% |
| S | 43628 | 8.9% |
| i | 36614 | 7.4% |
| a | 29521 | 6.0% |
| e | 29357 | 6.0% |
| n | 29033 | 5.9% |
| o | 27143 | 5.5% |
| E | 20832 | 4.2% |
| l | 18008 | 3.7% |
| V | 14390 | 2.9% |
| Other values (38) | 198777 |
Common
| Value | Count | Frequency (%) |
| (space) | 43525 | |
| / | 11318 | 20.2% |
| - | 583 | 1.0% |
| 4 | 415 | 0.7% |
| ) | 82 | 0.1% |
| ( | 82 | 0.1% |
| 6 | 39 | 0.1% |
| 2 | 35 | 0.1% |
| 3 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 548044 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 44661 | 8.1% |
| S | 43628 | 8.0% |
| (space) | 43525 | 7.9% |
| i | 36614 | 6.7% |
| a | 29521 | 5.4% |
| e | 29357 | 5.4% |
| n | 29033 | 5.3% |
| o | 27143 | 5.0% |
| E | 20832 | 3.8% |
| l | 18008 | 3.3% |
| Other values (47) | 225722 |
MISSING 
| Distinct | 70 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 1893051 |
| Missing (%) | 99.6% |
| Memory size | 72.9 MiB |
Length
| Max length | 35 |
|---|---|
| Median length | 30 |
| Mean length | 18.304665 |
| Min length | 2 |
Characters and Unicode
| Total characters | 155370 |
|---|---|
| Distinct characters | 55 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 34 |
|---|---|
| Unique (%) | 0.4% |
Sample
| 1st row | Station Wagon/Sport Utility Vehicle |
|---|---|
| 2nd row | Station Wagon/Sport Utility Vehicle |
| 3rd row | Sedan |
| 4th row | Sedan |
| 5th row | Station Wagon/Sport Utility Vehicle |
| Value | Count | Frequency (%) |
| vehicle | 3871 | |
| station | 3300 | |
| utility | 3300 | |
| sedan | 3178 | |
| wagon/sport | 2595 | |
| passenger | 1269 | 6.1% |
| wagon | 707 | 3.4% |
|  | 706 | 3.4% |
| sport | 705 | 3.4% |
| truck | 248 | 1.2% |
| Other values (71) | 1014 | 4.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 13047 | 8.4% |
| (space) | 12411 | 8.0% |
| S | 12260 | 7.9% |
| i | 10695 | 6.9% |
| a | 8538 | 5.5% |
| e | 8497 | 5.5% |
| n | 8421 | 5.4% |
| o | 7947 | 5.1% |
| l | 5257 | 3.4% |
| E | 5206 | 3.4% |
| Other values (45) | 63091 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 83473 | |
| Uppercase Letter | 55856 | |
| Space Separator | 12411 | 8.0% |
| Other Punctuation | 3301 | 2.1% |
| Dash Punctuation | 195 | 0.1% |
| Decimal Number | 108 | 0.1% |
| Open Punctuation | 13 | < 0.1% |
| Close Punctuation | 13 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 13047 | |
| i | 10695 | |
| a | 8538 | |
| e | 8497 | |
| n | 8421 | |
| o | 7947 | |
| l | 5257 | |
| d | 3283 | 3.9% |
| c | 3061 | 3.7% |
| r | 3007 | 3.6% |
| Other values (14) | 11720 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 12260 | |
| E | 5206 | |
| T | 3996 | 7.2% |
| V | 3964 | 7.1% |
| I | 3477 | 6.2% |
| U | 3444 | 6.2% |
| W | 3377 | 6.0% |
| N | 2949 | 5.3% |
| A | 2783 | 5.0% |
| O | 2278 | 4.1% |
| Other values (13) | 12122 |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 92 | |
| 2 | 9 | 8.3% |
| 6 | 7 | 6.5% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 12411 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 3301 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 195 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 13 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 13 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 139329 | |
| Common | 16041 | 10.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 13047 | 9.4% |
| S | 12260 | 8.8% |
| i | 10695 | 7.7% |
| a | 8538 | 6.1% |
| e | 8497 | 6.1% |
| n | 8421 | 6.0% |
| o | 7947 | 5.7% |
| l | 5257 | 3.8% |
| E | 5206 | 3.7% |
| T | 3996 | 2.9% |
| Other values (37) | 55465 |
Common
| Value | Count | Frequency (%) |
| (space) | 12411 | |
| / | 3301 | 20.6% |
| - | 195 | 1.2% |
| 4 | 92 | 0.6% |
| ( | 13 | 0.1% |
| ) | 13 | 0.1% |
| 2 | 9 | 0.1% |
| 6 | 7 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 155370 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 13047 | 8.4% |
| (space) | 12411 | 8.0% |
| S | 12260 | 7.9% |
| i | 10695 | 6.9% |
| a | 8538 | 5.5% |
| e | 8497 | 5.5% |
| n | 8421 | 5.4% |
| o | 7947 | 5.1% |
| l | 5257 | 3.4% |
| E | 5206 | 3.4% |
| Other values (45) | 63091 |
crash_hour
Real number (ℝ)
ZEROS 
| Distinct | 24 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.181395 |
| Minimum | 0 |
|---|---|
| Maximum | 23 |
| Zeros | 62782 |
| Zeros (%) | 3.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 29.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 9 |
| median | 14 |
| Q3 | 18 |
| 95-th percentile | 22 |
| Maximum | 23 |
| Range | 23 |
| Interquartile range (IQR) | 9 |
Descriptive statistics
| Standard deviation | 5.7787509 |
|---|---|
| Coefficient of variation (CV) | 0.43840209 |
| Kurtosis | -0.44228475 |
| Mean | 13.181395 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | -0.43622471 |
| Sum | 25064936 |
| Variance | 33.393963 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 16 | 135576 | 7.1% |
| 17 | 132832 | 7.0% |
| 14 | 126151 | 6.6% |
| 15 | 118366 | 6.2% |
| 18 | 116843 | 6.1% |
| 13 | 109281 | 5.7% |
| 8 | 104431 | 5.5% |
| 12 | 104328 | 5.5% |
| 9 | 100246 | 5.3% |
| 11 | 98034 | 5.2% |
| Other values (14) | 755451 |
| Value | Count | Frequency (%) |
| 0 | 62782 | |
| 1 | 33564 | 1.8% |
| 2 | 25882 | 1.4% |
| 3 | 22807 | 1.2% |
| 4 | 25747 | 1.4% |
| 5 | 27729 | 1.5% |
| 6 | 42341 | |
| 7 | 58147 | |
| 8 | 104431 | |
| 9 | 100246 |
| Value | Count | Frequency (%) |
| 23 | 52838 | 2.8% |
| 22 | 62880 | |
| 21 | 69162 | |
| 20 | 81066 | |
| 19 | 96700 | |
| 18 | 116843 | |
| 17 | 132832 | |
| 16 | 135576 | |
| 15 | 118366 | |
| 14 | 126151 |
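The descriptive statistics reported for crash_hour can be cross-checked against each other. The sketch below is not part of the profiling output; it only recomputes the coefficient of variation (standard deviation divided by mean), the variance (standard deviation squared) and the sum (mean times the number of observations) from the rounded figures in the tables above, so small rounding differences are expected. All variable names are illustrative.

```python
# Figures copied from the crash_hour tables and the dataset overview.
n_obs = 1_901_539          # Number of observations
mean = 13.181395           # Mean
std = 5.7787509            # Standard deviation

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

# Variance is the square of the standard deviation.
variance = std ** 2

# The sum can be approximated from the rounded mean.
approx_sum = mean * n_obs

print(f"CV       ~ {cv:.6f}   (report: 0.43840209)")
print(f"Variance ~ {variance:.4f}  (report: 33.393963)")
print(f"Sum      ~ {approx_sum:.0f}  (report: 25064936)")
```

Because the inputs are rounded, the recomputed sum can differ from the reported one by about one unit.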
crash_day
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 130.8 MiB |
| Friday | |
|---|---|
| Thursday | |
| Tuesday | |
| Wednesday | |
| Monday | |
| Other values (2) |
Length
| Max length | 9 |
|---|---|
| Median length | 8 |
| Mean length | 7.1523682 |
| Min length | 6 |
Characters and Unicode
| Total characters | 13600507 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Wednesday |
|---|---|
| 2nd row | Saturday |
| 3rd row | Tuesday |
| 4th row | Tuesday |
| 5th row | Tuesday |
Common Values
| Value | Count | Frequency (%) |
| Friday | 303250 | |
| Thursday | 283437 | |
| Tuesday | 279745 | |
| Wednesday | 276856 | |
| Monday | 271784 | |
| Saturday | 257043 | |
| Sunday | 229424 |
| Value | Count | Frequency (%) |
| friday | 303250 | |
| thursday | 283437 | |
| tuesday | 279745 | |
| wednesday | 276856 | |
| monday | 271784 | |
| saturday | 257043 | |
| sunday | 229424 |
Most occurring characters
| Value | Count | Frequency (%) |
| d | 2178395 | |
| a | 2158582 | |
| y | 1901539 | |
| u | 1049649 | |
| r | 843730 | 6.2% |
| s | 840038 | 6.2% |
| e | 833457 | 6.1% |
| n | 778064 | 5.7% |
| T | 563182 | 4.1% |
| S | 486467 | 3.6% |
| Other values (7) | 1967404 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11698968 | |
| Uppercase Letter | 1901539 | 14.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| d | 2178395 | |
| a | 2158582 | |
| y | 1901539 | |
| u | 1049649 | |
| r | 843730 | 7.2% |
| s | 840038 | 7.2% |
| e | 833457 | 7.1% |
| n | 778064 | 6.7% |
| i | 303250 | 2.6% |
| h | 283437 | 2.4% |
| Other values (2) | 528827 | 4.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 563182 | |
| S | 486467 | |
| F | 303250 | |
| W | 276856 | |
| M | 271784 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 13600507 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| d | 2178395 | |
| a | 2158582 | |
| y | 1901539 | |
| u | 1049649 | |
| r | 843730 | 6.2% |
| s | 840038 | 6.2% |
| e | 833457 | 6.1% |
| n | 778064 | 5.7% |
| T | 563182 | 4.1% |
| S | 486467 | 3.6% |
| Other values (7) | 1967404 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 13600507 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| d | 2178395 | |
| a | 2158582 | |
| y | 1901539 | |
| u | 1049649 | |
| r | 843730 | 6.2% |
| s | 840038 | 6.2% |
| e | 833457 | 6.1% |
| n | 778064 | 5.7% |
| T | 563182 | 4.1% |
| S | 486467 | 3.6% |
| Other values (7) | 1967404 |
crash_month
Categorical
HIGH CORRELATION 
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 129.1 MiB |
| October | |
|---|---|
| July | |
| September | |
| August | |
| December | |
| Other values (7) |
Length
| Max length | 9 |
|---|---|
| Median length | 7 |
| Mean length | 6.1817133 |
| Min length | 3 |
Characters and Unicode
| Total characters | 11754769 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | November |
|---|---|
| 2nd row | September |
| 3rd row | December |
| 4th row | December |
| 5th row | December |
Common Values
| Value | Count | Frequency (%) |
| October | 172792 | |
| July | 168803 | |
| September | 167637 | |
| August | 167294 | |
| December | 165401 | |
| November | 164778 | |
| June | 160323 | |
| May | 158408 | |
| March | 152600 | |
| January | 151778 | |
| Other values (2) | 271725 |
| Value | Count | Frequency (%) |
| october | 172792 | |
| july | 168803 | |
| september | 167637 | |
| august | 167294 | |
| december | 165401 | |
| november | 164778 | |
| june | 160323 | |
| may | 158408 | |
| march | 152600 | |
| january | 151778 | |
| Other values (2) | 271725 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 1800284 | |
| r | 1385210 | |
| u | 953991 | 8.1% |
| b | 809107 | 6.9% |
| a | 753063 | 6.4% |
| y | 617488 | 5.3% |
| t | 507723 | 4.3% |
| m | 497816 | 4.2% |
| c | 490793 | 4.2% |
| J | 480904 | 4.1% |
| Other values (16) | 3458390 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 9853230 | |
| Uppercase Letter | 1901539 | 16.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1800284 | |
| r | 1385210 | |
| u | 953991 | |
| b | 809107 | |
| a | 753063 | |
| y | 617488 | 6.3% |
| t | 507723 | 5.2% |
| m | 497816 | 5.1% |
| c | 490793 | 5.0% |
| o | 337570 | 3.4% |
| Other values (8) | 1700185 |
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 480904 | |
| M | 311008 | |
| A | 300520 | |
| O | 172792 | 9.1% |
| S | 167637 | 8.8% |
| D | 165401 | 8.7% |
| N | 164778 | 8.7% |
| F | 138499 | 7.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 11754769 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1800284 | |
| r | 1385210 | |
| u | 953991 | 8.1% |
| b | 809107 | 6.9% |
| a | 753063 | 6.4% |
| y | 617488 | 5.3% |
| t | 507723 | 4.3% |
| m | 497816 | 4.2% |
| c | 490793 | 4.2% |
| J | 480904 | 4.1% |
| Other values (16) | 3458390 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 11754769 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 1800284 | |
| r | 1385210 | |
| u | 953991 | 8.1% |
| b | 809107 | 6.9% |
| a | 753063 | 6.4% |
| y | 617488 | 5.3% |
| t | 507723 | 4.3% |
| m | 497816 | 4.2% |
| c | 490793 | 4.2% |
| J | 480904 | 4.1% |
| Other values (16) | 3458390 |
crash_year
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2017.4449 |
| Minimum | 2012 |
|---|---|
| Maximum | 2025 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 29.0 MiB |
Quantile statistics
| Minimum | 2012 |
|---|---|
| 5-th percentile | 2013 |
| Q1 | 2015 |
| median | 2017 |
| Q3 | 2020 |
| 95-th percentile | 2024 |
| Maximum | 2025 |
| Range | 13 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.3444729 |
|---|---|
| Coefficient of variation (CV) | 0.0016577766 |
| Kurtosis | -0.73977256 |
| Mean | 2017.4449 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.33249917 |
| Sum | 3.8362501 × 10⁹ |
| Variance | 11.185499 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2017 | 212566 | |
| 2018 | 212516 | |
| 2019 | 192239 | |
| 2016 | 188544 | |
| 2015 | 182642 | |
| 2014 | 171836 | |
| 2013 | 171563 | |
| 2020 | 102669 | 5.4% |
| 2021 | 99559 | 5.2% |
| 2022 | 91590 | 4.8% |
| Other values (4) | 275815 |
| Value | Count | Frequency (%) |
| 2012 | 85341 | |
| 2013 | 171563 | |
| 2014 | 171836 | |
| 2015 | 182642 | |
| 2016 | 188544 | |
| 2017 | 212566 | |
| 2018 | 212516 | |
| 2019 | 192239 | |
| 2020 | 102669 | |
| 2021 | 99559 |
| Value | Count | Frequency (%) |
| 2025 | 21115 | 1.1% |
| 2024 | 81948 | 4.3% |
| 2023 | 87411 | |
| 2022 | 91590 | |
| 2021 | 99559 | |
| 2020 | 102669 | |
| 2019 | 192239 | |
| 2018 | 212516 | |
| 2017 | 212566 | |
| 2016 | 188544 |
holiday_name
Categorical
HIGH CORRELATION  MISSING 
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1856022 |
| Missing (%) | 97.6% |
| Memory size | 74.3 MiB |
| Veterans Day | |
|---|---|
| Lincoln's Birthday | |
| Labour Day | |
| Columbus Day | |
| Independence Day | |
| Other values (7) |
Length
| Max length | 36 |
|---|---|
| Median length | 21 |
| Mean length | 15.858427 |
| Min length | 10 |
Characters and Unicode
| Total characters | 721828 |
|---|---|
| Distinct characters | 38 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Labour Day |
|---|---|
| 2nd row | Labour Day |
| 3rd row | Independence Day |
| 4th row | Independence Day |
| 5th row | Independence Day |
Common Values
| Value | Count | Frequency (%) |
| Veterans Day | 5076 | 0.3% |
| Lincoln's Birthday | 4748 | 0.2% |
| Labour Day | 4433 | 0.2% |
| Columbus Day | 4416 | 0.2% |
| Independence Day | 4205 | 0.2% |
| Martin Luther King, Jr. Day | 3815 | 0.2% |
| Thanksgiving Day | 3748 | 0.2% |
| New Year's Day | 3745 | 0.2% |
| Memorial Day | 3688 | 0.2% |
| Washington's Birthday | 3512 | 0.2% |
| Other values (2) | 4131 | 0.2% |
| (Missing) | 1856022 |
| Value | Count | Frequency (%) |
| day | 37257 | |
| birthday | 8260 | 7.6% |
| independence | 5237 | 4.8% |
| veterans | 5076 | 4.7% |
| lincoln's | 4748 | 4.4% |
| labour | 4433 | 4.1% |
| columbus | 4416 | 4.1% |
| jr | 3815 | 3.5% |
| king | 3815 | 3.5% |
| luther | 3815 | 3.5% |
| Other values (9) | 27416 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 78697 | 10.9% |
| (space) | 62771 | 8.7% |
| n | 55529 | 7.7% |
| e | 49189 | 6.8% |
| y | 45517 | 6.3% |
| r | 39746 | 5.5% |
| i | 39465 | 5.5% |
| D | 37257 | 5.2% |
| s | 34955 | 4.8% |
| t | 30673 | 4.2% |
| Other values (28) | 248029 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 531134 | |
| Uppercase Letter | 108288 | 15.0% |
| Space Separator | 62771 | 8.7% |
| Other Punctuation | 19635 | 2.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 78697 | |
| n | 55529 | |
| e | 49189 | |
| y | 45517 | |
| r | 39746 | 7.5% |
| i | 39465 | 7.4% |
| s | 34955 | 6.6% |
| t | 30673 | 5.8% |
| h | 23466 | 4.4% |
| o | 21829 | 4.1% |
| Other values (11) | 112068 |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 37257 | |
| L | 12996 | 12.0% |
| B | 8260 | 7.6% |
| C | 7515 | 6.9% |
| M | 7503 | 6.9% |
| I | 5237 | 4.8% |
| V | 5076 | 4.7% |
| J | 4847 | 4.5% |
| N | 4777 | 4.4% |
| K | 3815 | 3.5% |
| Other values (3) | 11005 | 10.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 12005 | |
| , | 3815 | 19.4% |
| . | 3815 | 19.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 62771 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 639422 | |
| Common | 82406 | 11.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 78697 | 12.3% |
| n | 55529 | 8.7% |
| e | 49189 | 7.7% |
| y | 45517 | 7.1% |
| r | 39746 | 6.2% |
| i | 39465 | 6.2% |
| D | 37257 | 5.8% |
| s | 34955 | 5.5% |
| t | 30673 | 4.8% |
| h | 23466 | 3.7% |
| Other values (24) | 204928 |
Common
| Value | Count | Frequency (%) |
| (space) | 62771 | |
| ' | 12005 | 14.6% |
| , | 3815 | 4.6% |
| . | 3815 | 4.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 721828 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 78697 | 10.9% |
| (space) | 62771 | 8.7% |
| n | 55529 | 7.7% |
| e | 49189 | 6.8% |
| y | 45517 | 6.3% |
| r | 39746 | 5.5% |
| i | 39465 | 5.5% |
| D | 37257 | 5.2% |
| s | 34955 | 4.8% |
| t | 30673 | 4.2% |
| Other values (28) | 248029 |
is_public_holiday
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 119.7 MiB |
| 0 | |
|---|---|
| 1 | 45517 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1901539 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1856022 | |
| 1 | 45517 | 2.4% |
| Value | Count | Frequency (%) |
| 0 | 1856022 | |
| 1 | 45517 | 2.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1856022 | |
| 1 | 45517 | 2.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1901539 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1856022 | |
| 1 | 45517 | 2.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1901539 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1856022 | |
| 1 | 45517 | 2.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1901539 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1856022 | |
| 1 | 45517 | 2.4% |
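The counts for holiday_name and is_public_holiday line up: the number of rows with a holiday name equals the number of rows flagged 1. The sketch below checks this from the reported figures only. That the flag was derived from the presence of a holiday name is an inference from these counts, not something the report states, and the variable names are illustrative.

```python
# Figures copied from the holiday_name and is_public_holiday tables.
n_obs = 1_901_539
holiday_name_missing = 1_856_022
is_public_holiday_counts = {0: 1_856_022, 1: 45_517}

# Per-holiday counts from the holiday_name "Common Values" table,
# with the two unnamed holidays kept as the single "Other values (2)" row.
holiday_name_counts = [
    5076, 4748, 4433, 4416, 4205,   # Veterans Day ... Independence Day
    3815, 3748, 3745, 3688, 3512,   # MLK Day ... Washington's Birthday
    4131,                           # Other values (2)
]

holiday_name_present = n_obs - holiday_name_missing
named_total = sum(holiday_name_counts)

# All three ways of counting holiday rows should agree.
consistent = (
    holiday_name_present == is_public_holiday_counts[1] == named_total
    and holiday_name_missing == is_public_holiday_counts[0]
)
print(holiday_name_present, named_total, consistent)
```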
Number_of_involved_Vehicles
Categorical
IMBALANCE 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 119.7 MiB |
| 2 | |
|---|---|
| 1 | |
| 3 | 101138 |
| 4 | 21859 |
| 5 | 8488 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1901539 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 3 |
|---|---|
| 2nd row | 1 |
| 3rd row | 2 |
| 4th row | 2 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 2 | 1394608 | |
| 1 | 375446 | 19.7% |
| 3 | 101138 | 5.3% |
| 4 | 21859 | 1.1% |
| 5 | 8488 | 0.4% |
| Value | Count | Frequency (%) |
| 2 | 1394608 | |
| 1 | 375446 | 19.7% |
| 3 | 101138 | 5.3% |
| 4 | 21859 | 1.1% |
| 5 | 8488 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 1394608 | |
| 1 | 375446 | 19.7% |
| 3 | 101138 | 5.3% |
| 4 | 21859 | 1.1% |
| 5 | 8488 | 0.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1901539 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1394608 | |
| 1 | 375446 | 19.7% |
| 3 | 101138 | 5.3% |
| 4 | 21859 | 1.1% |
| 5 | 8488 | 0.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1901539 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 1394608 | |
| 1 | 375446 | 19.7% |
| 3 | 101138 | 5.3% |
| 4 | 21859 | 1.1% |
| 5 | 8488 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1901539 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 1394608 | |
| 1 | 375446 | 19.7% |
| 3 | 101138 | 5.3% |
| 4 | 21859 | 1.1% |
| 5 | 8488 | 0.4% |
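The Number_of_involved_Vehicles counts match the missing-value counts of the four unnamed text variables profiled before crash_hour (375446, 1770054, 1871192 and 1893051 missing). The sketch below reproduces the distribution from those figures. It assumes, as a hypothesis not confirmed by this excerpt, that those four variables describe the second through fifth vehicle and that a crash involves k vehicles when exactly the first k slots are filled.

```python
# Figures copied from the report.
n_obs = 1_901_539

# Missing counts of the four unnamed text variables, in report order.
# ASSUMPTION: they correspond to vehicle slots 2, 3, 4 and 5.
missing_by_slot = {2: 375_446, 3: 1_770_054, 4: 1_871_192, 5: 1_893_051}

# Rows with at least k vehicles = rows where slot k is not missing.
at_least = {k: n_obs - m for k, m in missing_by_slot.items()}
at_least[1] = n_obs   # every row counts at least one vehicle in the report
at_least[6] = 0       # no sixth slot is profiled

# Rows with exactly k vehicles = (at least k) - (at least k + 1).
exactly = {k: at_least[k] - at_least[k + 1] for k in range(1, 6)}

reported = {1: 375_446, 2: 1_394_608, 3: 101_138, 4: 21_859, 5: 8_488}
print(exactly)
print(exactly == reported)
```

If the assumption holds, the imbalance flagged for this variable directly mirrors the very high missing rates (93.1% to 99.6%) of the later vehicle slots.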
geometry
Unsupported
REJECTED  UNSUPPORTED 
| Missing | 0 |
|---|---|
| Missing (%) | 0.0% |
| Memory size | 29.0 MiB |
BoroName
Categorical
HIGH CORRELATION 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 131.4 MiB |
| Brooklyn | |
|---|---|
| Queens | |
| Manhattan | |
| Bronx | |
| Staten Island |
Length
| Max length | 13 |
|---|---|
| Median length | 9 |
| Mean length | 7.4405915 |
| Min length | 5 |
Characters and Unicode
| Total characters | 14148575 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Brooklyn |
|---|---|
| 2nd row | Brooklyn |
| 3rd row | Brooklyn |
| 4th row | Bronx |
| 5th row | Brooklyn |
Common Values
| Value | Count | Frequency (%) |
| Brooklyn | 582400 | |
| Queens | 541364 | |
| Manhattan | 401197 | |
| Bronx | 283137 | |
| Staten Island | 93441 | 4.9% |
| Value | Count | Frequency (%) |
| brooklyn | 582400 | |
| queens | 541364 | |
| manhattan | 401197 | |
| bronx | 283137 | |
| staten | 93441 | 4.7% |
| island | 93441 | 4.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 2396177 | |
| o | 1447937 | |
| a | 1390473 | |
| e | 1176169 | 8.3% |
| t | 989276 | 7.0% |
| B | 865537 | 6.1% |
| r | 865537 | 6.1% |
| l | 675841 | 4.8% |
| s | 634805 | 4.5% |
| y | 582400 | 4.1% |
| Other values (10) | 3124423 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 12060154 | |
| Uppercase Letter | 1994980 | 14.1% |
| Space Separator | 93441 | 0.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 2396177 | |
| o | 1447937 | |
| a | 1390473 | |
| e | 1176169 | |
| t | 989276 | |
| r | 865537 | 7.2% |
| l | 675841 | 5.6% |
| s | 634805 | 5.3% |
| y | 582400 | 4.8% |
| k | 582400 | 4.8% |
| Other values (4) | 1319139 |
Uppercase Letter
| Value | Count | Frequency (%) |
| B | 865537 | |
| Q | 541364 | |
| M | 401197 | |
| S | 93441 | 4.7% |
| I | 93441 | 4.7% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 93441 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 14055134 | |
| Common | 93441 | 0.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 2396177 | |
| o | 1447937 | |
| a | 1390473 | |
| e | 1176169 | 8.4% |
| t | 989276 | 7.0% |
| B | 865537 | 6.2% |
| r | 865537 | 6.2% |
| l | 675841 | 4.8% |
| s | 634805 | 4.5% |
| y | 582400 | 4.1% |
| Other values (9) | 3030982 |
Common
| Value | Count | Frequency (%) |
| (space) | 93441 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 14148575 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 2396177 | |
| o | 1447937 | |
| a | 1390473 | |
| e | 1176169 | 8.3% |
| t | 989276 | 7.0% |
| B | 865537 | 6.1% |
| r | 865537 | 6.1% |
| l | 675841 | 4.8% |
| s | 634805 | 4.5% |
| y | 582400 | 4.1% |
| Other values (10) | 3124423 |
total_injured
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 35 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.63588598 |
| Minimum | 0 |
| Maximum | 86 |
| Zeros | 1451188 |
| Zeros (%) | 76.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 29.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 4 |
| Maximum | 86 |
| Range | 86 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.4086404 |
|---|---|
| Coefficient of variation (CV) | 2.2152406 |
| Kurtosis | 45.009948 |
| Mean | 0.63588598 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.1077916 |
| Sum | 1209162 |
| Variance | 1.9842679 |
| Monotonicity | Not monotonic |
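Several of the descriptive statistics for total_injured follow from one another. The sketch below only re-derives them from figures printed in this report (number of observations, sum, standard deviation); it assumes the usual definitions mean = sum / n, CV = standard deviation / mean, and variance = standard deviation squared, and it does not recompute anything from the raw data.

```python
import math

# Figures copied from the report.
N_OBSERVATIONS = 1901539      # "Number of observations"
TOTAL_INJURED_SUM = 1209162   # "Sum" for total_injured
STD_DEV = 1.4086404           # "Standard deviation" for total_injured

# Derived quantities under the standard definitions (an assumption here).
mean = TOTAL_INJURED_SUM / N_OBSERVATIONS
coefficient_of_variation = STD_DEV / mean
variance = STD_DEV ** 2

if __name__ == "__main__":
    print(f"mean     = {mean:.8f}")
    print(f"CV       = {coefficient_of_variation:.7f}")
    print(f"variance = {variance:.7f}")
    # Compare against the values printed in the report.
    print(math.isclose(mean, 0.63588598, rel_tol=1e-6))
```

The derived values agree with the reported Mean, Coefficient of variation (CV), and Variance to within rounding of the printed digits.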
Common Values
| Value | Count | Frequency (%) |
| 0 | 1451188 | 76.3% |
| 2 | 342874 | 18.0% |
| 4 | 64999 | 3.4% |
| 6 | 21362 | 1.1% |
| 8 | 7850 | 0.4% |
| 1 | 7629 | 0.4% |
| 10 | 2990 | 0.2% |
| 12 | 1232 | 0.1% |
| 14 | 514 | < 0.1% |
| 3 | 276 | < 0.1% |
| Other values (25) | 625 | < 0.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 1451188 | 76.3% |
| 1 | 7629 | 0.4% |
| 2 | 342874 | 18.0% |
| 3 | 276 | < 0.1% |
| 4 | 64999 | 3.4% |
| 5 | 30 | < 0.1% |
| 6 | 21362 | 1.1% |
| 7 | 8 | < 0.1% |
| 8 | 7850 | 0.4% |
| 9 | 3 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 86 | 1 | < 0.1% |
| 68 | 1 | < 0.1% |
| 64 | 1 | < 0.1% |
| 54 | 1 | < 0.1% |
| 50 | 1 | < 0.1% |
| 48 | 3 | < 0.1% |
| 46 | 1 | < 0.1% |
| 44 | 3 | < 0.1% |
| 42 | 1 | < 0.1% |
| 40 | 2 | < 0.1% |
total_killed
Real number (ℝ)
HIGH CORRELATION  SKEWED  ZEROS 
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.0029828471 |
| Minimum | 0 |
| Maximum | 16 |
| Zeros | 1898794 |
| Zeros (%) | 99.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 29.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 16 |
| Range | 16 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.08154706 |
|---|---|
| Coefficient of variation (CV) | 27.338666 |
| Kurtosis | 1996.4772 |
| Mean | 0.0029828471 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 34.152331 |
| Sum | 5672 |
| Variance | 0.006649923 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1898794 | 99.9% |
| 2 | 2597 | 0.1% |
| 4 | 71 | < 0.1% |
| 1 | 58 | < 0.1% |
| 6 | 13 | < 0.1% |
| 8 | 4 | < 0.1% |
| 16 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 1898794 | 99.9% |
| 1 | 58 | < 0.1% |
| 2 | 2597 | 0.1% |
| 4 | 71 | < 0.1% |
| 6 | 13 | < 0.1% |
| 8 | 4 | < 0.1% |
| 10 | 1 | < 0.1% |
| 16 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 16 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
| 8 | 4 | < 0.1% |
| 6 | 13 | < 0.1% |
| 4 | 71 | < 0.1% |
| 2 | 2597 | 0.1% |
| 1 | 58 | < 0.1% |
| 0 | 1898794 | 99.9% |
severity
Categorical
HIGH CORRELATION 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 135.7 MiB |
Category frequency plot: No Casualty, Injury, Fatal (2745)
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 9.8089647 |
| Min length | 5 |
Characters and Unicode
| Total characters | 18652129 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Injury |
|---|---|
| 2nd row | No Casualty |
| 3rd row | No Casualty |
| 4th row | Injury |
| 5th row | No Casualty |
Common Values
| Value | Count | Frequency (%) |
| No Casualty | 1449128 | 76.2% |
| Injury | 449666 | 23.6% |
| Fatal | 2745 | 0.1% |
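The severity counts line up with the zero counts reported for total_killed and total_injured. The sketch below shows one labelling rule that is consistent with those figures; the rule itself (fatalities take precedence over injuries) and the function name `classify_severity` are assumptions, since the report does not state how severity was derived.

```python
# Figures copied from the report.
N_OBSERVATIONS = 1901539
ZERO_KILLED = 1898794          # "Zeros" for total_killed
SEVERITY_COUNTS = {"No Casualty": 1449128, "Injury": 449666, "Fatal": 2745}


def classify_severity(total_killed, total_injured):
    """Hypothetical rule: any death is Fatal, else any injury is Injury."""
    if total_killed > 0:
        return "Fatal"
    if total_injured > 0:
        return "Injury"
    return "No Casualty"


# Consistency checks against the printed counts.
fatal_from_zeros = N_OBSERVATIONS - ZERO_KILLED
non_fatal_total = SEVERITY_COUNTS["No Casualty"] + SEVERITY_COUNTS["Injury"]

if __name__ == "__main__":
    print(fatal_from_zeros == SEVERITY_COUNTS["Fatal"])
    print(non_fatal_total == ZERO_KILLED)
    print(classify_severity(0, 2.0))
```

Under this rule the number of Fatal rows equals the number of rows with a non-zero total_killed, and the No Casualty and Injury rows together equal the rows with total_killed of zero.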
Common Values (Plot)
| Value | Count | Frequency (%) |
| no | 1449128 | 43.2% |
| casualty | 1449128 | 43.2% |
| injury | 449666 | 13.4% |
| fatal | 2745 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 2903746 | |
| u | 1898794 | |
| y | 1898794 | |
| l | 1451873 | |
| t | 1451873 | |
| N | 1449128 | |
| o | 1449128 | |
| (space) | 1449128 | 7.8% |
| C | 1449128 | |
| s | 1449128 | |
| Other values (5) | 1801409 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 13852334 | |
| Uppercase Letter | 3350667 | 18.0% |
| Space Separator | 1449128 | 7.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 2903746 | |
| u | 1898794 | |
| y | 1898794 | |
| l | 1451873 | |
| t | 1451873 | |
| o | 1449128 | |
| s | 1449128 | |
| n | 449666 | 3.2% |
| j | 449666 | 3.2% |
| r | 449666 | 3.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 1449128 | |
| C | 1449128 | |
| I | 449666 | 13.4% |
| F | 2745 | 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1449128 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 17203001 | |
| Common | 1449128 | 7.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 2903746 | |
| u | 1898794 | |
| y | 1898794 | |
| l | 1451873 | |
| t | 1451873 | |
| N | 1449128 | |
| o | 1449128 | |
| C | 1449128 | |
| s | 1449128 | |
| I | 449666 | 2.6% |
| Other values (4) | 1351743 |
Common
| Value | Count | Frequency (%) |
| (space) | 1449128 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 18652129 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 2903746 | |
| u | 1898794 | |
| y | 1898794 | |
| l | 1451873 | |
| t | 1451873 | |
| N | 1449128 | |
| o | 1449128 | |
| (space) | 1449128 | 7.8% |
| C | 1449128 | |
| s | 1449128 | |
| Other values (5) | 1801409 |
location_type
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 138.0 MiB |
Category frequency plot: intersection, off_street, mid_block
Length
| Max length | 12 |
|---|---|
| Median length | 12 |
| Mean length | 11.085937 |
| Min length | 9 |
Characters and Unicode
| Total characters | 21080341 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | intersection |
|---|---|
| 2nd row | off_street |
| 3rd row | mid_block |
| 4th row | off_street |
| 5th row | off_street |
Common Values
| Value | Count | Frequency (%) |
| intersection | 1187357 | 62.4% |
| off_street | 404419 | 21.3% |
| mid_block | 309763 | 16.3% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| intersection | 1187357 | 62.4% |
| off_street | 404419 | 21.3% |
| mid_block | 309763 | 16.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 3183552 | |
| e | 3183552 | |
| i | 2684477 | |
| n | 2374714 | |
| o | 1901539 | |
| r | 1591776 | |
| s | 1591776 | |
| c | 1497120 | |
| f | 808838 | 3.8% |
| _ | 714182 | 3.4% |
| Other values (5) | 1548815 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 20366159 | |
| Connector Punctuation | 714182 | 3.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 3183552 | |
| e | 3183552 | |
| i | 2684477 | |
| n | 2374714 | |
| o | 1901539 | |
| r | 1591776 | |
| s | 1591776 | |
| c | 1497120 | |
| f | 808838 | 4.0% |
| m | 309763 | 1.5% |
| Other values (4) | 1239052 | 6.1% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 714182 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 20366159 | |
| Common | 714182 | 3.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 3183552 | |
| e | 3183552 | |
| i | 2684477 | |
| n | 2374714 | |
| o | 1901539 | |
| r | 1591776 | |
| s | 1591776 | |
| c | 1497120 | |
| f | 808838 | 4.0% |
| m | 309763 | 1.5% |
| Other values (4) | 1239052 | 6.1% |
Common
| Value | Count | Frequency (%) |
| _ | 714182 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 21080341 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 3183552 | |
| e | 3183552 | |
| i | 2684477 | |
| n | 2374714 | |
| o | 1901539 | |
| r | 1591776 | |
| s | 1591776 | |
| c | 1497120 | |
| f | 808838 | 3.8% |
| _ | 714182 | 3.4% |
| Other values (5) | 1548815 |
Correlations
| | latitude | longitude | number_of_persons_injured | number_of_persons_killed | number_of_pedestrians_injured | number_of_pedestrians_killed | number_of_motorist_injured | number_of_motorist_killed | collision_id | crash_hour | crash_year | total_injured | total_killed | number_of_cyclist_injured | number_of_cyclist_killed | contributing_factor_vehicle_3 | contributing_factor_vehicle_4 | contributing_factor_vehicle_5 | crash_day | crash_month | holiday_name | is_public_holiday | Number_of_involved_Vehicles | BoroName | severity | location_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| latitude | 1.000 | 0.294 | -0.026 | -0.001 | 0.003 | -0.001 | -0.032 | -0.001 | -0.011 | -0.010 | -0.003 | -0.026 | -0.001 | 0.017 | 0.001 | 0.094 | 0.108 | 0.086 | 0.011 | 0.006 | 0.030 | 0.013 | 0.037 | 0.648 | 0.043 | 0.066 |
| longitude | 0.294 | 1.000 | 0.038 | 0.003 | -0.017 | 0.000 | 0.076 | 0.006 | 0.064 | -0.008 | 0.055 | 0.038 | 0.003 | 0.037 | 0.002 | 0.069 | 0.066 | 0.030 | 0.013 | 0.005 | 0.023 | 0.009 | 0.037 | 0.686 | 0.033 | 0.054 |
| number_of_persons_injured | -0.026 | 0.038 | 1.000 | 0.002 | 0.408 | -0.004 | 0.780 | 0.012 | 0.148 | 0.034 | 0.145 | 0.999 | 0.002 | 0.004 | 0.043 | 0.000 | 0.074 | 0.090 | 0.007 | 0.003 | 0.013 | 0.005 | 0.039 | 0.008 | 0.069 | 0.010 |
| number_of_persons_killed | -0.001 | 0.003 | 0.002 | 1.000 | -0.002 | 0.717 | 0.008 | 0.618 | 0.011 | -0.004 | 0.011 | 0.003 | 0.999 | 0.005 | 0.737 | 0.113 | 0.000 | 0.000 | 0.002 | 0.000 | 0.011 | 0.002 | 0.021 | 0.002 | 0.707 | 0.010 |
| number_of_pedestrians_injured | 0.003 | -0.017 | 0.408 | -0.002 | 1.000 | 0.002 | -0.089 | -0.004 | 0.023 | 0.034 | 0.025 | 0.410 | -0.002 | 0.000 | 0.167 | 0.000 | 0.690 | 0.000 | 0.001 | 0.000 | 0.005 | 0.005 | 0.012 | 0.002 | 0.028 | 0.003 |
| number_of_pedestrians_killed | -0.001 | 0.000 | -0.004 | 0.717 | 0.002 | 1.000 | -0.003 | 0.003 | 0.004 | -0.000 | 0.004 | -0.004 | 0.716 | 0.002 | 0.707 | 0.029 | 0.176 | 1.000 | 0.001 | 0.002 | 0.000 | 0.004 | 0.024 | 0.000 | 0.507 | 0.006 |
| number_of_motorist_injured | -0.032 | 0.076 | 0.780 | 0.008 | -0.089 | -0.003 | 1.000 | 0.018 | 0.119 | -0.000 | 0.115 | 0.783 | 0.008 | 0.004 | 0.000 | 0.000 | 0.043 | 0.102 | 0.007 | 0.003 | 0.015 | 0.005 | 0.038 | 0.008 | 0.066 | 0.011 |
| number_of_motorist_killed | -0.001 | 0.006 | 0.012 | 0.618 | -0.004 | 0.003 | 0.018 | 1.000 | 0.008 | -0.006 | 0.008 | 0.012 | 0.619 | 0.001 | 0.000 | 0.031 | 0.000 | 0.000 | 0.004 | 0.001 | 0.014 | 0.000 | 0.011 | 0.004 | 0.438 | 0.008 |
| collision_id | -0.011 | 0.064 | 0.148 | 0.011 | 0.023 | 0.004 | 0.119 | 0.008 | 1.000 | -0.030 | 0.991 | 0.146 | 0.011 | 0.039 | 0.004 | 0.213 | 0.263 | 0.354 | 0.016 | 0.108 | 0.152 | 0.013 | 0.114 | 0.059 | 0.131 | 0.305 |
| crash_hour | -0.010 | -0.008 | 0.034 | -0.004 | 0.034 | -0.000 | -0.000 | -0.006 | -0.030 | 1.000 | -0.031 | 0.034 | -0.004 | 0.023 | 0.003 | 0.065 | 0.093 | 0.115 | 0.075 | 0.016 | 0.069 | 0.029 | 0.055 | 0.027 | 0.049 | 0.041 |
| crash_year | -0.003 | 0.055 | 0.145 | 0.011 | 0.025 | 0.004 | 0.115 | 0.008 | 0.991 | -0.031 | 1.000 | 0.143 | 0.011 | 0.038 | 0.005 | 0.197 | 0.206 | 0.200 | 0.011 | 0.050 | 0.101 | 0.018 | 0.114 | 0.050 | 0.133 | 0.285 |
| total_injured | -0.026 | 0.038 | 0.999 | 0.003 | 0.410 | -0.004 | 0.783 | 0.012 | 0.146 | 0.034 | 0.143 | 1.000 | 0.003 | 0.004 | 0.043 | 0.000 | 0.074 | 0.090 | 0.007 | 0.003 | 0.013 | 0.005 | 0.039 | 0.008 | 0.069 | 0.010 |
| total_killed | -0.001 | 0.003 | 0.002 | 0.999 | -0.002 | 0.716 | 0.008 | 0.619 | 0.011 | -0.004 | 0.011 | 0.003 | 1.000 | 0.005 | 0.738 | 0.113 | 0.000 | 0.000 | 0.002 | 0.000 | 0.011 | 0.002 | 0.021 | 0.002 | 0.700 | 0.010 |
| number_of_cyclist_injured | 0.017 | 0.037 | 0.004 | 0.005 | 0.000 | 0.002 | 0.004 | 0.001 | 0.039 | 0.023 | 0.038 | 0.004 | 0.005 | 1.000 | 0.021 | 0.209 | 0.000 | 0.000 | 0.002 | 0.026 | 0.049 | 0.001 | 0.030 | 0.032 | 0.221 | 0.020 |
| number_of_cyclist_killed | 0.001 | 0.002 | 0.043 | 0.737 | 0.167 | 0.707 | 0.000 | 0.000 | 0.004 | 0.003 | 0.005 | 0.043 | 0.738 | 0.021 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.002 | 0.011 | 0.000 | 0.008 | 0.002 | 0.206 | 0.003 |
| contributing_factor_vehicle_3 | 0.094 | 0.069 | 0.000 | 0.113 | 0.000 | 0.029 | 0.000 | 0.031 | 0.213 | 0.065 | 0.197 | 0.000 | 0.113 | 0.209 | 1.000 | 1.000 | 0.738 | 0.796 | 0.023 | 0.058 | 0.000 | 0.000 | 0.132 | 0.110 | 0.078 | 0.389 |
| contributing_factor_vehicle_4 | 0.108 | 0.066 | 0.074 | 0.000 | 0.690 | 0.176 | 0.043 | 0.000 | 0.263 | 0.093 | 0.206 | 0.074 | 0.000 | 0.000 | 1.000 | 0.738 | 1.000 | 0.807 | 0.026 | 0.084 | 0.000 | 0.000 | 0.088 | 0.129 | 0.000 | 0.440 |
| contributing_factor_vehicle_5 | 0.086 | 0.030 | 0.090 | 0.000 | 0.000 | 1.000 | 0.102 | 0.000 | 0.354 | 0.115 | 0.200 | 0.090 | 0.000 | 0.000 | 1.000 | 0.796 | 0.807 | 1.000 | 0.082 | 0.131 | 0.264 | 0.150 | 0.056 | 0.163 | 0.000 | 0.473 |
| crash_day | 0.011 | 0.013 | 0.007 | 0.002 | 0.001 | 0.001 | 0.007 | 0.004 | 0.016 | 0.075 | 0.011 | 0.007 | 0.002 | 0.002 | 0.000 | 0.023 | 0.026 | 0.082 | 1.000 | 0.012 | 0.466 | 0.198 | 0.021 | 0.012 | 0.008 | 0.010 |
| crash_month | 0.006 | 0.005 | 0.003 | 0.000 | 0.000 | 0.002 | 0.003 | 0.001 | 0.108 | 0.016 | 0.050 | 0.003 | 0.000 | 0.026 | 0.002 | 0.058 | 0.084 | 0.131 | 0.012 | 1.000 | 0.994 | 0.128 | 0.012 | 0.009 | 0.017 | 0.025 |
| holiday_name | 0.030 | 0.023 | 0.013 | 0.011 | 0.005 | 0.000 | 0.015 | 0.014 | 0.152 | 0.069 | 0.101 | 0.013 | 0.011 | 0.049 | 0.011 | 0.000 | 0.000 | 0.264 | 0.466 | 0.994 | 1.000 | 1.000 | 0.032 | 0.037 | 0.053 | 0.049 |
| is_public_holiday | 0.013 | 0.009 | 0.005 | 0.002 | 0.005 | 0.004 | 0.005 | 0.000 | 0.013 | 0.029 | 0.018 | 0.005 | 0.002 | 0.001 | 0.000 | 0.000 | 0.000 | 0.150 | 0.198 | 0.128 | 1.000 | 1.000 | 0.011 | 0.011 | 0.004 | 0.000 |
| Number_of_involved_Vehicles | 0.037 | 0.037 | 0.039 | 0.021 | 0.012 | 0.024 | 0.038 | 0.011 | 0.114 | 0.055 | 0.114 | 0.039 | 0.021 | 0.030 | 0.008 | 0.132 | 0.088 | 0.056 | 0.021 | 0.012 | 0.032 | 0.011 | 1.000 | 0.041 | 0.137 | 0.102 |
| BoroName | 0.648 | 0.686 | 0.008 | 0.002 | 0.002 | 0.000 | 0.008 | 0.004 | 0.059 | 0.027 | 0.050 | 0.008 | 0.002 | 0.032 | 0.002 | 0.110 | 0.129 | 0.163 | 0.012 | 0.009 | 0.037 | 0.011 | 0.041 | 1.000 | 0.043 | 0.041 |
| severity | 0.043 | 0.033 | 0.069 | 0.707 | 0.028 | 0.507 | 0.066 | 0.438 | 0.131 | 0.049 | 0.133 | 0.069 | 0.700 | 0.221 | 0.206 | 0.078 | 0.000 | 0.000 | 0.008 | 0.017 | 0.053 | 0.004 | 0.137 | 0.043 | 1.000 | 0.058 |
| location_type | 0.066 | 0.054 | 0.010 | 0.010 | 0.003 | 0.006 | 0.011 | 0.008 | 0.305 | 0.041 | 0.285 | 0.010 | 0.010 | 0.020 | 0.003 | 0.389 | 0.440 | 0.473 | 0.010 | 0.025 | 0.049 | 0.000 | 0.102 | 0.041 | 0.058 | 1.000 |
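The "High correlation" alerts at the top of the report can be related to this matrix by thresholding its entries. The sketch below is an illustration under an assumed cutoff of 0.5 on the absolute coefficient; that cutoff is not stated in the report, though it fits the alerts shown (0.507 is flagged, 0.408 is not). The function name `high_correlation_pairs` is hypothetical.

```python
# A few coefficients copied from the matrix above, keyed by unordered pair.
CORRELATIONS = {
    frozenset(("latitude", "longitude")): 0.294,
    frozenset(("latitude", "BoroName")): 0.648,
    frozenset(("longitude", "BoroName")): 0.686,
    frozenset(("collision_id", "crash_year")): 0.991,
    frozenset(("number_of_persons_injured", "number_of_pedestrians_injured")): 0.408,
    frozenset(("number_of_pedestrians_killed", "severity")): 0.507,
}

# Assumed cutoff; the report does not print the threshold it used.
THRESHOLD = 0.5


def high_correlation_pairs(correlations, threshold=THRESHOLD):
    """Return (pair, coefficient) for every pair at or above the cutoff."""
    return sorted(
        (tuple(sorted(pair)), value)
        for pair, value in correlations.items()
        if abs(value) >= threshold
    )


if __name__ == "__main__":
    for pair, value in high_correlation_pairs(CORRELATIONS):
        print(f"{pair[0]} ~ {pair[1]}: {value:.3f}")
```

Storing each pair as a `frozenset` reflects that the matrix entries used here are symmetric, so each pair is listed only once.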
First rows
| | crash_date | crash_time | zip_code | latitude | longitude | location | on_street_name | cross_street_name | off_street_name | number_of_persons_injured | number_of_persons_killed | number_of_pedestrians_injured | number_of_pedestrians_killed | number_of_cyclist_injured | number_of_cyclist_killed | number_of_motorist_injured | number_of_motorist_killed | contributing_factor_vehicle_1 | contributing_factor_vehicle_2 | contributing_factor_vehicle_3 | contributing_factor_vehicle_4 | contributing_factor_vehicle_5 | collision_id | vehicle_type_code_1 | vehicle_type_code_2 | vehicle_type_code_3 | vehicle_type_code_4 | vehicle_type_code_5 | crash_hour | crash_day | crash_month | crash_year | holiday_name | is_public_holiday | Number_of_involved_Vehicles | geometry | BoroName | total_injured | total_killed | severity | location_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 2023-11-01 | 01:29:00 | 11230 | 40.621790 | -73.970024 | (40.62179, -73.970024) | OCEAN PARKWAY | AVENUE K | NaN | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | None | None | None | None | None | 4675373 | Moped | Sedan | Sedan | NaN | NaN | 1 | Wednesday | November | 2023 | NaN | 0 | 3 | POINT (-73.97002 40.62179) | Brooklyn | 2.0 | 0.0 | Injury | intersection |
| 9 | 2021-09-11 | 09:35:00 | 11208 | 40.667202 | -73.866500 | (40.667202, -73.8665) | NaN | NaN | 1211 LORING AVENUE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | None | None | None | None | None | 4456314 | Sedan | NaN | NaN | NaN | NaN | 9 | Saturday | September | 2021 | NaN | 0 | 1 | POINT (-73.8665 40.6672) | Brooklyn | 0.0 | 0.0 | No Casualty | off_street |
| 12 | 2021-12-14 | 17:05:00 | NaN | 40.709183 | -73.956825 | (40.709183, -73.956825) | BROOKLYN QUEENS EXPRESSWAY | NaN | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | passing too closely | None | None | None | None | 4486555 | Sedan | Tractor Truck Diesel | NaN | NaN | NaN | 17 | Tuesday | December | 2021 | NaN | 0 | 2 | POINT (-73.95682 40.70918) | Brooklyn | 0.0 | 0.0 | No Casualty | mid_block |
| 13 | 2021-12-14 | 08:17:00 | 10475 | 40.868160 | -73.831480 | (40.86816, -73.83148) | NaN | NaN | 344 BAYCHESTER AVENUE | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | None | None | None | None | None | 4486660 | Sedan | Sedan | NaN | NaN | NaN | 8 | Tuesday | December | 2021 | NaN | 0 | 2 | POINT (-73.83148 40.86816) | Bronx | 4.0 | 0.0 | Injury | off_street |
| 14 | 2021-12-14 | 21:10:00 | 11207 | 40.671720 | -73.897100 | (40.67172, -73.8971) | NaN | NaN | 2047 PITKIN AVENUE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | driver inexperience | None | None | None | None | 4487074 | Sedan | NaN | NaN | NaN | NaN | 21 | Tuesday | December | 2021 | NaN | 0 | 1 | POINT (-73.8971 40.67172) | Brooklyn | 0.0 | 0.0 | No Casualty | off_street |
| 15 | 2021-12-14 | 14:58:00 | 10017 | 40.751440 | -73.973970 | (40.75144, -73.97397) | 3 AVENUE | EAST 43 STREET | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | passing too closely | None | None | None | None | 4486519 | Sedan | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | 14 | Tuesday | December | 2021 | NaN | 0 | 2 | POINT (-73.97397 40.75144) | Manhattan | 0.0 | 0.0 | No Casualty | intersection |
| 16 | 2021-12-13 | 00:34:00 | NaN | 40.701275 | -73.888870 | (40.701275, -73.88887) | MYRTLE AVENUE | NaN | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | passing or lane usage improper | None | None | None | None | 4486934 | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | NaN | 0 | Monday | December | 2021 | NaN | 0 | 1 | POINT (-73.88887 40.70128) | Queens | 0.0 | 0.0 | No Casualty | mid_block |
| 17 | 2021-12-14 | 16:50:00 | 11413 | 40.675884 | -73.755770 | (40.675884, -73.75577) | SPRINGFIELD BOULEVARD | EAST GATE PLAZA | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | turning improperly | None | None | None | None | 4487127 | Sedan | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | 16 | Tuesday | December | 2021 | NaN | 0 | 2 | POINT (-73.75577 40.67588) | Queens | 0.0 | 0.0 | No Casualty | intersection |
| 19 | 2021-12-14 | 00:59:00 | NaN | 40.596620 | -74.002310 | (40.59662, -74.00231) | BELT PARKWAY | NaN | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | unsafe speed | None | None | None | None | 4486564 | Sedan | NaN | NaN | NaN | NaN | 0 | Tuesday | December | 2021 | NaN | 0 | 1 | POINT (-74.00231 40.59662) | Brooklyn | 0.0 | 0.0 | No Casualty | mid_block |
| 20 | 2021-12-14 | 23:10:00 | 11434 | 40.666840 | -73.789410 | (40.66684, -73.78941) | NORTH CONDUIT AVENUE | 150 STREET | NaN | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | reaction to uninvolved vehicle | None | None | None | None | 4486635 | Sedan | Sedan | NaN | NaN | NaN | 23 | Tuesday | December | 2021 | NaN | 0 | 2 | POINT (-73.78941 40.66684) | Queens | 4.0 | 0.0 | Injury | intersection |
Last rows
| | crash_date | crash_time | zip_code | latitude | longitude | location | on_street_name | cross_street_name | off_street_name | number_of_persons_injured | number_of_persons_killed | number_of_pedestrians_injured | number_of_pedestrians_killed | number_of_cyclist_injured | number_of_cyclist_killed | number_of_motorist_injured | number_of_motorist_killed | contributing_factor_vehicle_1 | contributing_factor_vehicle_2 | contributing_factor_vehicle_3 | contributing_factor_vehicle_4 | contributing_factor_vehicle_5 | collision_id | vehicle_type_code_1 | vehicle_type_code_2 | vehicle_type_code_3 | vehicle_type_code_4 | vehicle_type_code_5 | crash_hour | crash_day | crash_month | crash_year | holiday_name | is_public_holiday | Number_of_involved_Vehicles | geometry | BoroName | total_injured | total_killed | severity | location_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2169675 | 2025-04-09 | 19:08:00 | 10003 | 40.736020 | -73.98227 | (40.73602, -73.98227) | E 20 ST | 2 AVE | NaN | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | unsafe speed | None | None | None | None | 4806318 | Bike | Bike | NaN | NaN | NaN | 19 | Wednesday | April | 2025 | NaN | 0 | 2 | POINT (-73.98227 40.73602) | Manhattan | 2.0 | 0.0 | Injury | intersection |
| 2169676 | 2025-04-15 | 18:31:00 | 11418 | 40.697830 | -73.83564 | (40.69783, -73.83564) | 113 ST | JAMAICA AVE | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | backing unsafely | None | None | None | None | 4806035 | Sedan | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | 18 | Tuesday | April | 2025 | NaN | 0 | 2 | POINT (-73.83564 40.69783) | Queens | 0.0 | 0.0 | No Casualty | intersection |
| 2169677 | 2025-04-15 | 15:52:00 | 10461 | 40.854298 | -73.85492 | (40.854298, -73.85492) | NaN | NaN | 2007 WILLIAMSBRIDGE RD | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | driver inattention/distraction | None | None | None | None | 4805948 | Sedan | Sedan | NaN | NaN | NaN | 15 | Tuesday | April | 2025 | NaN | 0 | 2 | POINT (-73.85492 40.8543) | Bronx | 0.0 | 0.0 | No Casualty | off_street |
| 2169678 | 2025-04-15 | 20:00:00 | 11366 | 40.728012 | -73.78483 | (40.728012, -73.78483) | UNION TPKE | 184 ST | NaN | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | traffic control disregarded | None | None | None | None | 4806383 | Station Wagon/Sport Utility Vehicle | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | 20 | Tuesday | April | 2025 | NaN | 0 | 2 | POINT (-73.78483 40.72801) | Queens | 4.0 | 0.0 | Injury | intersection |
| 2169679 | 2025-04-15 | 14:30:00 | 10036 | 40.757553 | -73.98551 | (40.757553, -73.98551) | NaN | NaN | 1516 BROADWAY | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | traffic control disregarded | None | None | None | None | 4806096 | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | NaN | 14 | Tuesday | April | 2025 | NaN | 0 | 1 | POINT (-73.98551 40.75755) | Manhattan | 2.0 | 0.0 | Injury | off_street |
| 2169680 | 2025-04-15 | 23:20:00 | 11691 | 40.610480 | -73.75028 | (40.61048, -73.75028) | NaN | NaN | 12-50 REDFERN AVE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | view obstructed/limited | None | None | None | None | 4806081 | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | NaN | 23 | Tuesday | April | 2025 | NaN | 0 | 1 | POINT (-73.75028 40.61048) | Queens | 0.0 | 0.0 | No Casualty | off_street |
| 2169681 | 2025-04-07 | 08:50:00 | 11221 | 40.695114 | -73.91186 | (40.695114, -73.91186) | PUTNAM AVE | KNICKERBOCKER AVE | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | backing unsafely | None | None | None | None | 4806432 | Sedan | NaN | NaN | NaN | NaN | 8 | Monday | April | 2025 | NaN | 0 | 1 | POINT (-73.91186 40.69511) | Brooklyn | 0.0 | 0.0 | No Casualty | intersection |
| 2169682 | 2025-04-15 | 05:58:00 | NaN | 40.761272 | -73.95571 | (40.761272, -73.95571) | FDR DRIVE | NaN | NaN | 2 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | driver inattention/distraction | None | None | None | None | 4806221 | Station Wagon/Sport Utility Vehicle | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | 5 | Tuesday | April | 2025 | NaN | 0 | 2 | POINT (-73.95571 40.76127) | Manhattan | 4.0 | 0.0 | Injury | mid_block |
| 2169684 | 2025-04-14 | 21:25:00 | 11436 | 40.675716 | -73.79124 | (40.675716, -73.79124) | NaN | NaN | 147-06 123 AVE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | turning improperly | None | None | None | None | 4806294 | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | NaN | 21 | Monday | April | 2025 | NaN | 0 | 1 | POINT (-73.79124 40.67572) | Queens | 0.0 | 0.0 | No Casualty | off_street |
| 2169686 | 2025-03-23 | 13:00:00 | 10462 | 40.836330 | -73.85505 | (40.83633, -73.85505) | NaN | NaN | 1502 OLMSTEAD AVE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | None | None | None | None | None | 4806253 | Sedan | NaN | NaN | NaN | NaN | 13 | Sunday | March | 2025 | NaN | 0 | 1 | POINT (-73.85505 40.83633) | Bronx | 0.0 | 0.0 | No Casualty | off_street |
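In the sample rows with injuries, total_injured equals the sum of the four injury columns, which comes to twice number_of_persons_injured. One possible reading, offered as an assumption rather than a documented definition, is that total_injured adds number_of_persons_injured to its own pedestrian, cyclist, and motorist breakdown; that would also fit the dominance of even values in the total_injured counts. The sketch below only checks this pattern on rows copied from the tables above.

```python
# (persons, pedestrians, cyclists, motorists, total_injured) per collision_id,
# copied from the sample rows that report at least one injury.
SAMPLE_ROWS = {
    4675373: (1, 0, 0, 1, 2.0),
    4486660: (2, 0, 0, 2, 4.0),
    4806318: (1, 0, 1, 0, 2.0),
    4806096: (1, 1, 0, 0, 2.0),
    4806221: (2, 1, 0, 1, 4.0),
}


def summed_injuries(persons, pedestrians, cyclists, motorists):
    """Sum of all four injury columns (hypothetical reconstruction)."""
    return persons + pedestrians + cyclists + motorists


def check_rows(rows):
    """Return collision_ids whose total_injured differs from the column sum."""
    return [
        collision_id
        for collision_id, (persons, peds, cyclists, motorists, total) in rows.items()
        if summed_injuries(persons, peds, cyclists, motorists) != total
    ]


if __name__ == "__main__":
    mismatches = check_rows(SAMPLE_ROWS)
    print("rows that break the pattern:", mismatches)
```

A handful of sample rows cannot confirm the pattern for the full dataset; the same check would need to run over every row before total_injured is used as a casualty count.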